
Speed comparison of gets versus read+split when reading a text file


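@小李 in the Tcl/Tk Insider group asked a question: when reading a text file, which is faster, reading it line by line with gets, or reading it in one go with read and then breaking it into lines with split?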

The conclusion first:

Method                  Time (µs per iteration)   Normalized (gets = 1)
gets                    24238.67                  1.000
read + split            14123.58                  0.583
binary + gets           17925.47                  0.740
binary + read + split   12449.53                  0.514
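
For reference, the normalized column is just each timing divided by the gets timing. A minimal sketch of that arithmetic follows; the `normalize` proc and the `timings` dict are names made up for this illustration and are not part of the original test script.

```tcl
# Timings in microseconds per iteration, copied from the figures above.
set timings {
    "gets"                  24238.67
    "read + split"          14123.58
    "binary + gets"         17925.47
    "binary + read + split" 12449.53
}

# Hypothetical helper: divide every timing by the baseline entry and
# format the ratio with three decimals.
proc normalize {timings baseline} {
    set base [dict get $timings $baseline]
    set result {}
    dict for {name usec} $timings {
        dict set result $name [format %.3f [expr {$usec / $base}]]
    }
    return $result
}

dict for {name ratio} [normalize $timings "gets"] {
    puts [format "%-22s %s" $name $ratio]
}
```

Read this way, read + split took a bit under 60% of the gets time in this run, and switching the channel to binary helped both approaches.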


Here is the test code:

Code: read.tcl
lassign $argv file COUNT

#-------------------------------------------#

puts "init system cache ..."
set fp [open $file "r"]
read $fp
close $fp

#-------------------------------------------#

set fout [open /dev/null "w"]

puts "time $COUNT {gets} ..."
puts [time {
  set fp [open $file "r"]
  while {[gets $fp line]>=0} {
    puts $fout $line
  }
  close $fp
} $COUNT ]

#-------------------------------------------#

puts "time $COUNT {read + split} ..."
puts [time {
  set fp [open $file "r"]
  foreach line [split [read $fp] "\n"] {
    puts $fout $line
  }
  close $fp
} $COUNT ]

#-------------------------------------------#

# Reuse the /dev/null channel in $fout opened above; opening it again would leak the first one.

puts "time $COUNT {binary + gets} ..."
puts [time {
  set fp [open $file "r"]
  fconfigure $fp -encoding binary -translation binary
  while {[gets $fp line]>=0} {
    puts $fout $line
  }
  close $fp
} $COUNT ]

#-------------------------------------------#

puts "time $COUNT {binary + read + split} ..."
puts [time {
  set fp [open $file "r"]
  fconfigure $fp -encoding binary -translation binary

  foreach line [split [read $fp] "\n"] {
    puts $fout $line
  }
  close $fp
} $COUNT ]

close $fout
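
One detail worth keeping in mind when comparing the two approaches: they do not see exactly the same number of lines. When the file ends with a newline, split returns one extra empty element at the end, so the read + split loops write one more (empty) line than the gets loops do. The sketch below shows the difference on a three-line temporary file, and how `read -nonewline` removes it. The proc names are made up for this illustration, and `file tempfile` assumes Tcl 8.6 or later.

```tcl
# Create a small temporary file with exactly three newline-terminated lines.
# (file tempfile requires Tcl 8.6 or later.)
set ch [file tempfile path]
puts -nonewline $ch "a\nb\nc\n"
close $ch

# Count lines the way a gets loop sees them.
proc countGets {path} {
    set fp [open $path r]
    set n 0
    while {[gets $fp line] >= 0} { incr n }
    close $fp
    return $n
}

# Count the list elements produced by read + split.
proc countReadSplit {path} {
    set fp [open $path r]
    set n [llength [split [read $fp] "\n"]]
    close $fp
    return $n
}

# Same, but let read drop the final newline before splitting.
proc countReadSplitNoNewline {path} {
    set fp [open $path r]
    set n [llength [split [read -nonewline $fp] "\n"]]
    close $fp
    return $n
}

set nGets      [countGets $path]
set nSplit     [countReadSplit $path]
set nSplitNoNl [countReadSplitNoNewline $path]
file delete $path

puts "gets:                     $nGets"       ;# 3
puts "read + split:             $nSplit"      ;# 4 (trailing empty element)
puts "read -nonewline + split:  $nSplitNoNl"  ;# 3
```

For the 12000-line test file a single extra empty line should be negligible for the timing, but it matters if the two loops are expected to produce identical output.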

Test results

Output:
% tclsh read.tcl test.txt 100
init system cache ...
time 100 {gets} ...
24238.67 microseconds per iteration
time 100 {read + split} ...
14123.58 microseconds per iteration
time 100 {binary + gets} ...
17925.47 microseconds per iteration
time 100 {binary + read + split} ...
12449.53 microseconds per iteration

For completeness, the code that generates the test data file:

Code: write.tcl
lassign $argv file

set fout [open $file "w"]
for {set i 1} {$i <= 12000} {incr i} {
  puts $fout [string repeat "1234567890abcdefgh " 7]
}
close $fout
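
To put the timings in perspective, here is a rough estimate of how big the generated file is and what the gets timing works out to per line. It assumes LF line endings and one byte per character (true for this ASCII data in the common encodings); the variable names are only for this illustration.

```tcl
# One repeated chunk from write.tcl: 18 characters plus a trailing space.
set chunk "1234567890abcdefgh "

# Each line is 7 chunks plus the newline that puts appends.
set lineLen    [expr {[string length $chunk] * 7 + 1}]
set totalBytes [expr {$lineLen * 12000}]

# Per-line cost of the plain gets loop, using the measured 24238.67 µs per iteration.
set usecPerLine [format %.2f [expr {24238.67 / 12000}]]

puts "bytes per line:       $lineLen"      ;# 134
puts "total bytes:          $totalBytes"   ;# 1608000
puts "gets, µs per line:    $usecPerLine"  ;# 2.02
```

So each iteration reads roughly 1.6 MB in 12000 lines, and the slowest variant spends about 2 microseconds per line, which includes the puts to /dev/null.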


Asked Sep 2, 2015 | Category: Syntax and Commands | User: 风行水上 (-30 points)
