Python Scrapy persistent storage (saving to a MySQL database)

    Saving to MySQL works almost exactly the same way as saving to an Excel file. For how to create a Scrapy project and do the basic configuration, you can first refer to these links:

    https://blog.csdn.net/Di77HaoWenMing/article/details/108905515
    https://blog.csdn.net/Di77HaoWenMing/article/details/108914692
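    For reference, the pipeline code below reads a single filmname field from the item. A minimal items.py might look like the sketch below; YxqItem is my guess at the class name (the yxq project name comes from the settings shown later), so adjust it to your own project:

    import scrapy


    class YxqItem(scrapy.Item):
        # the only field the pipelines in this post rely on
        filmname = scrapy.Field()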

    After that, you only need to add a new class to pipelines.py. Below I added a class called SaveMysql:

    import pandas as pd
    import pymysql


    class YxqPipeline:
        def process_item(self, item, spider):
            # write the scraped film names to an Excel file
            filmname = item['filmname']
            mydata = pd.DataFrame({'电影名称': filmname})  # column header means "film name"
            mydata.to_excel('E:/filmname.xlsx', index=False)
            return item


    class SaveMysql:
        def process_item(self, item, spider):
            # connect to the local MySQL database
            conn = pymysql.connect(host='localhost', user='root', password='123456',
                                   database='demo', charset='utf8')
            cur = conn.cursor()
            # insert one row per film name; executemany expects one parameter tuple per row
            sql = 'insert into filmname(name) values (%s)'
            cur.executemany(sql, [(name,) for name in item['filmname']])
            conn.commit()
            cur.close()
            conn.close()
            return item
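    Both pipelines treat item['filmname'] as a list of names, so the spider is expected to collect every name on the page into one item. A rough sketch of such a spider is shown below; the URL and the CSS selector are placeholders of mine, not from the original project:

    import scrapy

    from yxq.items import YxqItem  # hypothetical item class, see the sketch above


    class FilmSpider(scrapy.Spider):
        name = 'film'
        start_urls = ['https://example.com/films']  # placeholder URL

        def parse(self, response):
            item = YxqItem()
            # gather every film name on the page into one list,
            # so each pipeline receives the whole batch in a single item
            item['filmname'] = response.css('a.film-title::text').getall()
            yield item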

    If you're still unsure about setting up MySQL or how to write the data after crawling, you can refer to these links:

    https://blog.csdn.net/Di77HaoWenMing/article/details/108876851
    https://blog.csdn.net/Di77HaoWenMing/article/details/108835436
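    Note that the insert in SaveMysql assumes a filmname table with a name column already exists in the demo database. A one-off setup script along these lines would create it; the id column and the varchar(255) type are my assumptions, not from the original post:

    import pymysql

    conn = pymysql.connect(host='localhost', user='root', password='123456',
                           database='demo', charset='utf8')
    with conn.cursor() as cur:
        # one-off setup: SaveMysql's insert assumes this table exists
        cur.execute(
            'create table if not exists filmname ('
            ' id int primary key auto_increment,'
            ' name varchar(255))'
        )
    conn.commit()
    conn.close()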

    Finally, add the configuration in settings.py so that the newly added SaveMysql class actually gets called:

    ITEM_PIPELINES = {
        'yxq.pipelines.YxqPipeline': 300,
        'yxq.pipelines.SaveMysql': 301,
    }
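    The numbers decide the order the pipelines run in: lower values run first, so here YxqPipeline (300) writes the Excel file before SaveMysql (301) writes to MySQL. Scrapy only requires the values to be integers in the 0-1000 range.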

    Run the spider and you're done!
