Counting the Lines in a File with Python

Published: 2018-05-07 15:16 | Category: Python | Views: 1,353

On Linux, wc counts the lines in a file quickly, for example:

wc -l /etc/passwd

How can the same count be done in Python? There are several approaches.

Method 1: Read the file and count

For example:

#!/usr/bin/env python3
import time

start_time = time.time()
with open("/tmp/nbhao.org.log", "r") as f:
    # Count the newline characters one line at a time
    print(sum(line.count("\n") for line in f))
print(time.time() - start_time, "seconds")

This method is slow.
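To compare the methods in this post on equal terms, the timing code can be pulled into a small helper. The following is a minimal sketch, not part of the original scripts: the names `timed` and `count_newlines` are made up for illustration, and `time.perf_counter` is swapped in for `time.time` because it is intended for measuring short intervals.

```python
import time


def timed(func, *args, repeat=3):
    """Call func(*args) `repeat` times; return (result, best elapsed seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return result, best


def count_newlines(path):
    """Method 1 as a function: count newline characters line by line."""
    with open(path, "r") as f:
        return sum(line.count("\n") for line in f)
```

Calling `timed(count_newlines, "/tmp/nbhao.org.log")` returns the line count together with the fastest of three runs, which smooths out some of the noise from disk caching.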

Method 2: Call the wc command

For example:

#!/usr/bin/env python3
import os
import time

start_time = time.time()
# wc prints the count and the file name directly to stdout
os.system("wc -l /tmp/nbhao.org.log")
print(time.time() - start_time, "seconds")

This method does not work on Windows, so it is not portable.
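One way around the portability problem is to use wc only when it is available and fall back to pure Python otherwise. The sketch below assumes Python 3.7 or later; the function names `count_lines` and `count_lines_python` are hypothetical. It also captures the output of wc and returns the count as an integer, which `os.system` cannot do.

```python
import shutil
import subprocess


def count_lines_python(path):
    """Fallback: count lines that end in a newline, the same convention as wc -l."""
    with open(path, "rb") as f:
        return sum(line.endswith(b"\n") for line in f)


def count_lines(path):
    """Use wc -l when it is on PATH, otherwise fall back to pure Python."""
    wc = shutil.which("wc")
    if wc is None:
        return count_lines_python(path)
    # An argument list (no shell) avoids quoting problems with unusual file names
    result = subprocess.run(
        [wc, "-l", path], capture_output=True, text=True, check=True
    )
    # wc prints "<count> <path>"; the count is the first whitespace-separated field
    return int(result.stdout.split()[0])
```

With this shape the caller gets the same integer on Linux and on Windows, at the cost of the fallback being slower than wc.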

Method 3: Read the file in chunks

For example:

#!/usr/bin/env python3
import time

start_time = time.time()

count = 0
# Binary mode skips text decoding; each read pulls in up to 8 MiB
with open("/tmp/nbhao.org.log", "rb") as logs_file:
    while True:
        chunk = logs_file.read(1024 * 8192)
        if not chunk:
            break
        count += chunk.count(b"\n")
print(count)

print(time.time() - start_time, "seconds")
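The chunked loop counts newline characters, so a file whose last line has no trailing newline is reported one short, which is also how wc -l behaves. If that last line should count, one possible adjustment is to remember the final byte read. This is a sketch under that assumption; `count_lines_chunked` and its `chunk_size` parameter are illustrative names, not from the original script.

```python
def count_lines_chunked(path, chunk_size=8 * 1024 * 1024):
    """Count lines in binary chunks, including an unterminated final line."""
    count = 0
    last_byte = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            last_byte = chunk[-1:]  # slice keeps this a bytes object
    # A non-empty file that does not end in a newline has one more line
    if last_byte and last_byte != b"\n":
        count += 1
    return count
```

An empty file still yields 0, and a file that ends with a newline gives the same result as the plain chunked count.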

Method 4: Use enumerate

For example:

#!/usr/bin/env python3
import time

start_time = time.time()

count = 0
with open("/tmp/nbhao.org.log", "r") as f:
    # start=1 leaves count equal to the number of lines; an empty file keeps it at 0
    for count, line in enumerate(f, start=1):
        pass
print(count)

print(time.time() - start_time, "seconds")

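Methods 3 and 4 run at about the same speed, but on closer inspection method 3 uses more memory than method 4, since it holds a chunk of up to 8 MiB in memory at a time.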
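The memory difference between the two methods can be checked with the standard `tracemalloc` module. The sketch below is one way to do it, not the measurement used for this post; `count_chunked`, `count_enumerate`, and `peak_memory` are hypothetical names, and the figures reported cover only allocations that Python itself tracks.

```python
import tracemalloc


def count_chunked(path, chunk_size=1024 * 8192):
    """Method 3: count newlines in fixed-size binary chunks."""
    with open(path, "rb") as f:
        return sum(c.count(b"\n") for c in iter(lambda: f.read(chunk_size), b""))


def count_enumerate(path):
    """Method 4: let enumerate number the lines."""
    count = 0
    with open(path, "rb") as f:
        for count, _ in enumerate(f, start=1):
            pass
    return count


def peak_memory(func, path):
    """Return (result, peak traced bytes) for a single call of func(path)."""
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Running `peak_memory(count_chunked, path)` and `peak_memory(count_enumerate, path)` on the same log file gives two peak values that can be compared directly.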

Avoid the following approach: readlines() loads every line of the file into memory at once.

count = len(open('/tmp/nbhao.org.log', 'r').readlines())
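If a one-liner is still wanted, a generator expression gives the same kind of count without building a list of all lines. This is a minimal sketch; `count_lines_streaming` is a made-up name, and it assumes binary mode is acceptable, which means only "\n" is treated as a line ending.

```python
def count_lines_streaming(path):
    """Count lines lazily; only one line is held in memory at a time."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)
```

Like the readlines() version, this counts a final line even when it has no trailing newline, and the with block closes the file as soon as the count is done.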

Reference: https://www.linuxhub.org/?p=3104


Permalink: https://www.sijitao.net/2652.html
