@小李 in the Tcl/Tk Insider group asked a question: when reading in a text file, which is faster, reading it line by line with gets, or reading it in one go with read and then split?
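The conclusion first: in this test, read + split was faster than gets, and switching the channel to binary mode sped up both approaches.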
Here is the test code:
lassign $argv file COUNT

#-------------------------------------------#
# Read the file once so it is in the OS cache before timing.
puts "init system cache ..."
set fp [open $file "r"]
read $fp
close $fp

#-------------------------------------------#
# All output goes to /dev/null; this channel is shared by all four tests.
set fout [open /dev/null "w"]
puts "time $COUNT {gets} ..."
puts [time {
    set fp [open $file "r"]
    while {[gets $fp line]>=0} {
        puts $fout $line
    }
    close $fp
} $COUNT ]

#-------------------------------------------#
puts "time $COUNT {read + split} ..."
puts [time {
    set fp [open $file "r"]
    foreach line [split [read $fp] "\n"] {
        puts $fout $line
    }
    close $fp
} $COUNT ]

#-------------------------------------------#
puts "time $COUNT {binary + gets} ..."
puts [time {
    set fp [open $file "r"]
    fconfigure $fp -encoding binary -translation binary
    while {[gets $fp line]>=0} {
        puts $fout $line
    }
    close $fp
} $COUNT ]

#-------------------------------------------#
puts "time $COUNT {binary + read + split} ..."
puts [time {
    set fp [open $file "r"]
    fconfigure $fp -encoding binary -translation binary
    foreach line [split [read $fp] "\n"] {
        puts $fout $line
    }
    close $fp
} $COUNT ]

close $fout
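To check that the two approaches being timed really produce the same lines, they can be wrapped in small procs and compared. This is only a minimal sketch: the proc names `lines_gets` and `lines_read_split` are hypothetical and not part of the benchmark. It uses `read -nonewline` so that split does not return an extra empty element for the file's final newline.

```tcl
# Hypothetical helper: collect the lines one at a time with gets.
proc lines_gets {file} {
    set fp [open $file r]
    set lines {}
    while {[gets $fp line] >= 0} {
        lappend lines $line
    }
    close $fp
    return $lines
}

# Hypothetical helper: slurp the whole file, then split on newlines.
# -nonewline drops the single trailing newline, so split does not
# produce a trailing empty element.
proc lines_read_split {file} {
    set fp [open $file r]
    set data [read -nonewline $fp]
    close $fp
    return [split $data "\n"]
}
```

For a file that ends with a newline, `lines_gets $file` and `lines_read_split $file` should return equal lists. The benchmark itself calls plain `read`, so its foreach loop sees one extra empty line at the end.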
Test results:
% tclsh read.tcl test.txt 100
init system cache ...
time 100 {gets} ...
24238.67 microseconds per iteration
time 100 {read + split} ...
14123.58 microseconds per iteration
time 100 {binary + gets} ...
17925.47 microseconds per iteration
time 100 {binary + read + split} ...
12449.53 microseconds per iteration
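The timings are easier to compare as ratios against the plain gets run. The sketch below copies the four numbers from the run above and prints each one with its speedup; the `speedup` proc is a hypothetical helper introduced here for illustration.

```tcl
# Hypothetical helper: how many times faster a run is than the baseline.
proc speedup {base_us us} {
    return [expr {double($base_us) / $us}]
}

# Timings copied from the run above (microseconds per iteration).
set timings {
    "gets"                  24238.67
    "read + split"          14123.58
    "binary + gets"         17925.47
    "binary + read + split" 12449.53
}

set base [dict get $timings "gets"]
dict for {name us} $timings {
    puts [format "%-22s %10.2f us  %.2fx" $name $us [speedup $base $us]]
}
```

On these numbers, read + split comes out roughly 1.7 times as fast as gets, and binary + read + split a little under 2 times as fast. The figures come from a single machine and a single file, so the ratios should be taken as indicative only.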
For reference, here is the code that generates the test data file:
lassign $argv file

set fout [open $file "w"]
for {set i 1} {$i <= 12000} {incr i} {
    puts $fout [string repeat "1234567890abcdefgh " 7]
}
close $fout
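For a sense of scale, the size of the generated file can be worked out from the generator's parameters. This is a rough sketch that assumes each line ends in a single LF byte (on Windows the default translation writes CRLF, which adds one byte per line); the `expected_size` proc is a hypothetical helper.

```tcl
# Hypothetical helper: expected file size in bytes, assuming one byte per
# character and a single LF at the end of each line.
proc expected_size {chunk repeats lines} {
    set line_bytes [expr {[string length $chunk] * $repeats + 1}]
    return [expr {$line_bytes * $lines}]
}

# Same parameters as the generator above: a 19-character chunk repeated
# 7 times per line, 12000 lines.
puts "expected size: [expected_size "1234567890abcdefgh " 7 12000] bytes"
```

The result can be compared against `file size test.txt` after running the generator. Under the LF assumption, each line is 134 bytes and the file is about 1.6 MB, which is the amount of data each timed iteration reads.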