[Basic Exercise] Processing Grade Data from a CSV File Without pandas

Key takeaways

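(1) A for loop over range(n) starts at index 0 by default. To start iterating from 1, pass a start value to range, for example range(1, len(con)).
(2) The exercise is to process a CSV file (values separated by commas) without pandas. Each line returned by open() followed by readlines() therefore has to be split by hand: find the position of the first comma in what is left of the line, take the text before it as one field, discard the part already consumed, and repeat. The loop over the lines looks like this: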

with open("student.csv") as file:
    con = file.readlines() 
    # print(con)
    for i in range(1, len(con)):
        line = con[i]
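
The comma-by-comma approach from note (2) can be sketched as a small helper. This is only an illustration under stated assumptions: the function name split_fields is hypothetical (it does not appear in the original code), the sample lines are typed in by hand instead of being read from student.csv, and no field is assumed to contain a comma of its own.

```python
def split_fields(line):
    """Split one CSV line on commas using only str.find and slicing."""
    line = line.rstrip("\n")        # drop the trailing newline, if any
    fields = []
    pos = line.find(",")            # position of the first remaining comma
    while pos != -1:                # find() returns -1 when no comma is left
        fields.append(line[:pos])   # the text before the comma is one field
        line = line[pos + 1:]       # discard the part already consumed
        pos = line.find(",")
    fields.append(line)             # whatever remains is the last field
    return fields


# range(1, n) starts at index 1, so the header row at index 0 is skipped
lines = ["姓名,性别,成绩\n", "佩奇哥,男,20\n", "刘德华,男,100\n"]
rows = [split_fields(lines[i]) for i in range(1, len(lines))]
print(rows)  # [['佩奇哥', '男', '20'], ['刘德华', '男', '100']]
```

A helper like this breaks on fields that contain commas inside quotes; for such files the csv module is the safer choice.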

Contents

Key takeaways
Background of a simple exercise
1. Without pandas
2. The pandas library
3. The csv library
Reference

Background of a simple exercise

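A CSV table contains each student's name, sex, and score. Use Python to compute the average score of the boys and of the girls separately, and print, in sorted order, the students who scored above the average. Do not simply call Python's built-in functions; preferably write the sorting and averaging functions yourself.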

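If libraries are allowed, the file can be read with pandas.read_csv or with the csv module. If pandas is not allowed, open the file with open() and read all of its lines with readlines(), keeping in mind that the fields in each line are separated by commas. Each student's record is then stored as a dict in a list. The code also includes a hand-written quicksort that sorts this list of dicts by score, and at the end it prints the boys and the girls who scored above their group's average.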
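
Since the exercise asks for hand-written helpers rather than built-ins, the averaging step could be wrapped in a function along the following lines. The name average_score and the hard-coded sample records are assumptions made for illustration; the original code computes the same sums inline rather than in a function.

```python
def average_score(records, sex):
    """Return the mean score of the records whose 'sex' matches, or None."""
    total = 0
    count = 0
    for rec in records:       # accumulate by hand instead of using sum()/len()
        if rec["sex"] == sex:
            total += rec["score"]
            count += 1
    if count == 0:            # avoid dividing by zero for an empty group
        return None
    return total / count


# Sample records in the same shape as the list of dicts described above
records = [
    {"name": "佩奇哥", "sex": "男", "score": 20},
    {"name": "刘德华", "sex": "男", "score": 100},
    {"name": "张学友", "sex": "女", "score": 40},
    {"name": "张继科", "sex": "男", "score": 80},
    {"name": "杨幂", "sex": "女", "score": 60},
]
print(average_score(records, "男"))  # boys' average
print(average_score(records, "女"))  # 50.0
```

Returning None for an empty group is one possible design choice; raising an exception would work equally well if an empty group should be treated as an error.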

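First, write the student data into a CSV file yourself, with the values separated by commas. The student records before sorting (lstcopy) look like this:

[Image: student records before sorting (lstcopy)]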

1. Without pandas

First, make up some student data and save it as a CSV file (the values are separated by commas):

姓名,性别,成绩
佩奇哥,男,20
刘德华,男,100
张学友,女,40
张继科,男,80
杨幂,女,60

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 29 17:07:02 2021

@author: 86493
"""
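# Approach 3 (improved): use only open()
classmate = {}  # maps each student's name to that student's record
lst = []        # list of student records (dicts), in file order
class student:
    def __init__(self, name, sex, score):
        self.name = name
        self.sex = sex
        self.score = score
        self.StudentDict = {'name': name,
                            'sex': sex,
                            'score': score}
"""
exp = "andyguo"
ret = exp.find(',')
print(ret) # -1 means the substring was not found
"""

# Same encoding as in the pandas and csv sections
with open("student.csv", encoding="GB18030") as file:
    con = file.readlines()
    # print(con)
    for i in range(1, len(con)):  # start at 1 to skip the header row
        line = con[i]
        line = line.replace("\n", "")  # strip the trailing newline
        if not line:                   # ignore blank lines
            continue
        # print(line)
        pos1 = line.find(',')            # first comma: end of the name
        pos2 = line.find(',', pos1 + 1)  # second comma: end of the sex field
        name = line[:pos1]
        sex = line[pos1 + 1:pos2]
        score = line[pos2 + 1:]          # still a str here; converted later
        stu = student(name, sex, score)
        # print(stu.StudentDict)
        classmate.update({stu.name: stu.StudentDict})
        lst.append(stu.StudentDict)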

# print("Dictionary of the whole class:\n", classmate)

# Compute the average score of the boys and of the girls
BoyNum, GirlNum, BoyScoreSum, GirlScoreSum = 0, 0, 0, 0
EndBoyLst, EndGirlLst = [], []

for astu in lst:
    astu['score'] = int(astu['score'])  # scores were read as str; convert once
    if astu['sex'] == '男':  # '男' = male
        # print(astu['score'])
        # print(type(astu['score'])) # <class 'str'> without the int() above
        BoyScoreSum += astu['score']
        BoyNum += 1
        #EndBoyLst.append(astu['name']) # add to the boys' result list
    elif astu['sex'] == '女':  # '女' = female
        GirlScoreSum += astu['score']
        GirlNum += 1
        #EndGirlLst.append(astu['name'])

baverage = BoyScoreSum / BoyNum
gaverage = GirlScoreSum / GirlNum
print("Boys' average score:", baverage)
print("Girls' average score:", gaverage, "\n")

# ------------
# Sorting, option 2: a hand-written sort, here quicksort
# print(classmate['佩奇哥']) # a dict is indexed by key, not by 0, 1, 2, ...
def QuickSort(lst1, i, j):  # i: left index, j: right index
    if i >= j:
        return lst1
    pivot = lst1[i]
    low = i
    high = j

    # Partition around the pivot
    while i < j:
        while i < j and lst1[j]['score'] >= pivot['score']:
            j -= 1
        lst1[i] = lst1[j]
        while i < j and lst1[i]['score'] <= pivot['score']:
            i += 1
        lst1[j] = lst1[i]
    lst1[j] = pivot  # i == j here: the pivot's final position

    QuickSort(lst1, low, i - 1)
    QuickSort(lst1, i + 1, high)
    return lst1

lstcopy = lst.copy()  # shallow copy that keeps the original, unsorted order
endlst = QuickSort(lst, 0, len(lst) - 1)  # sorts lst in place
print("Student records after sorting:\n", endlst, "\n")
# ------------

# Collect the students who scored above their group's average
for astu in endlst:
    if astu['sex'] == '男' and astu['score'] > baverage:
        EndBoyLst.append(astu['name'])  # add to the boys' result list
    elif astu['sex'] == '女' and astu['score'] > gaverage:
        EndGirlLst.append(astu['name'])
print("Boys above the boys' average:\n", EndBoyLst)
print("Girls above the girls' average:\n", EndGirlLst)

# Sorting, option 1: use sorted() with the score as the sort key
# changelst = sorted(lst, key = lambda stu: stu['score'])



# Leftover test for a plain-integer version of QuickSort; the version
# above expects dicts with a 'score' key, so this is kept disabled.
"""
if __name__=="__main__":
    lst1=[30,24,5,58,18,36,12,42,39]
    print("Sequence before sorting:")
    for i in lst1:
        print(i,end =" ")
    print("\nSequence after sorting:")
    for i in QuickSort(lst1,0,len(lst1)-1):
        print(i,end=" ")
"""

Output:

Boys' average score: 66.66666666666667
Girls' average score: 50.0 

Student records after sorting:
 [{'name': '佩奇哥', 'sex': '男', 'score': 20}, 
 {'name': '张学友', 'sex': '女', 'score': 40}, 
 {'name': '杨幂', 'sex': '女', 'score': 60}, 
 {'name': '张继科', 'sex': '男', 'score': 80}, 
 {'name': '刘德华', 'sex': '男', 'score': 100}] 

Boys above the boys' average:
 ['张继科', '刘德华']
Girls above the girls' average:
 ['杨幂']

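The student records after sorting (endlst) are shown below: the scores are now in ascending order, and the boys and girls who scored above their group's average are then listed separately.

[Image: student records after sorting (endlst)]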
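
A hand-written sort is easy to get subtly wrong, so one way to gain some confidence is to compare it against sorted() on random input. The sketch below re-implements the same partition scheme as a stand-alone function; the name quick_sort_by, its key parameter, and the random test data are assumptions introduced for this illustration, not part of the original code.

```python
import random


def quick_sort_by(items, key, left, right):
    """Sort items[left..right] in place, ascending by items[k][key]."""
    if left >= right:
        return items
    pivot = items[left]
    i, j = left, right
    while i < j:
        # move j leftwards past elements that belong right of the pivot
        while i < j and items[j][key] >= pivot[key]:
            j -= 1
        items[i] = items[j]
        # move i rightwards past elements that belong left of the pivot
        while i < j and items[i][key] <= pivot[key]:
            i += 1
        items[j] = items[i]
    items[i] = pivot  # i == j: the pivot's final position
    quick_sort_by(items, key, left, i - 1)
    quick_sort_by(items, key, i + 1, right)
    return items


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(100):
    size = random.randint(0, 20)
    data = [{"score": random.randint(0, 100)} for _ in range(size)]
    expected = sorted(d["score"] for d in data)
    quick_sort_by(data, "score", 0, len(data) - 1)
    assert [d["score"] for d in data] == expected
print("quick_sort_by agrees with sorted() on 100 random lists")
```

Using sorted() only as a reference in the check keeps the sort itself hand-written, which is what the exercise asks for.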

2. The pandas library

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 29 15:00:45 2021

@author: 86493
"""
# Approach 1: read the CSV with pandas
import pandas as pd
# read_csv already returns a DataFrame, so no extra pd.DataFrame() call is needed
df = pd.read_csv('student.csv', encoding='GB18030')
print(df)
print(df.shape) # (5, 3): five rows, three columns
classmate = {}


class student:
    def __init__(self, name, sex, score):
        self.name = name
        self.sex = sex
        self.score = score
        self.StudentDict = {'name': name,
                            'sex': sex,
                            'score': score}

for i in range(df.shape[0]):
    print(i)
    name = df.loc[i, '姓名']        # column '姓名' = name
    sex = df.loc[i, '性别']         # column '性别' = sex
    score = int(df.loc[i, '成绩'])  # column '成绩' = score, as a plain int
    stu = student(name, sex, score)
    # print(stu.StudentDict)
    classmate.update({stu.name: stu.StudentDict})

print(classmate)

Output:

0  佩奇哥  男   20
1  刘德华  男  100
2  张学友  女   40
3  张继科  男   80
4   杨幂  女   60
(5, 3)
0
1
2
3
4
{'佩奇哥': {'name': '佩奇哥', 'sex': '男', 'score': 20}, 
'刘德华': {'name': '刘德华', 'sex': '男', 'score': 100}, 
'张学友': {'name': '张学友', 'sex': '女', 'score': 40}, 
'张继科': {'name': '张继科', 'sex': '男', 'score': 80}, 
'杨幂': {'name': '杨幂', 'sex': '女', 'score': 60}}

3. The csv library

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 29 16:48:30 2021

@author: 86493
"""
# Approach 2: read the CSV with the csv module
import csv
classmate = {}

class student:
    def __init__(self, name, sex, score):
        self.name = name
        self.sex = sex
        self.score = score
        self.StudentDict = {'name': name,
                            'sex': sex,
                            'score': score}

with open('student.csv', encoding='GB18030', newline='') as csvfile:
    spamreader = csv.reader(csvfile)
    # Note: the header row is not skipped here, so it is stored as if it
    # were a student, and every score stays a string (see the output).
    for row in spamreader:
        print(row)
        print(row[0])
        name = row[0]
        sex = row[1]
        score = row[2]
        stu = student(name, sex, score)
        # print(stu.StudentDict)
        classmate.update({stu.name: stu.StudentDict})

print(classmate)

Output:

['姓名', '性别', '成绩']
姓名
['佩奇哥', '男', '20']
佩奇哥
['刘德华', '男', '100']
刘德华
['张学友', '女', '40']
张学友
['张继科', '男', '80']
张继科
['杨幂', '女', '60']
杨幂
{'姓名': {'name': '姓名', 'sex': '性别', 'score': '成绩'}, 
'佩奇哥': {'name': '佩奇哥', 'sex': '男', 'score': '20'}, 
'刘德华': {'name': '刘德华', 'sex': '男', 'score': '100'}, 
'张学友': {'name': '张学友', 'sex': '女', 'score': '40'}, 
'张继科': {'name': '张继科', 'sex': '男', 'score': '80'}, 
'杨幂': {'name': '杨幂', 'sex': '女', 'score': '60'}}
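
The output above shows two side effects of the csv.reader loop: the header row ends up in classmate as if it were a student, and every score is still a string. One possible way around both, sketched here with csv.DictReader, is to let the reader consume the header and to convert the score explicitly. The in-memory sample_csv text stands in for student.csv and is an assumption made so that the example is self-contained.

```python
import csv
import io

# Stand-in for student.csv so the sketch runs without a file on disk
sample_csv = "姓名,性别,成绩\n佩奇哥,男,20\n刘德华,男,100\n张学友,女,40\n"

classmate = {}
# DictReader uses the first row as field names, so it is not returned as data
for row in csv.DictReader(io.StringIO(sample_csv)):
    classmate[row["姓名"]] = {
        "name": row["姓名"],
        "sex": row["性别"],
        "score": int(row["成绩"]),  # convert the text "20" to the number 20
    }

print(classmate["刘德华"])  # {'name': '刘德华', 'sex': '男', 'score': 100}
print("姓名" in classmate)   # False
```

For the real file, io.StringIO(sample_csv) would be replaced by a file object such as the one returned by open('student.csv', encoding='GB18030', newline='').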

Reference

(1) How to tell delimiter commas from commas inside the content when reading a CSV file in Python
(2) Three ways to read a CSV file in Python