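1. Reference links
The referenced blogger's series of posts is very thorough; this post records how I combined them in practice.
Using Python to quickly obtain the download links of the data in the Copernicus Open Access Hub cart
Batch-downloading Sentinel satellite data with IDM
Batch-downloading offline Sentinel data with the sentinelsat package and IDM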
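2. Crawling the download links
Prerequisite: set the search criteria (sensing period, sensor [S2], product level [L1C]); every product returned by the search will be downloaded.
Search: enter the following query in the search box: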
( footprint:"Intersects(POLYGON((103.76445041927829 31.681927098519466,103.85257167382683 31.681927098519466,103.85257167382683 31.767318441848857,103.76445041927829 31.767318441848857,103.76445041927829 31.681927098519466)))" ) AND ( beginPosition:[2017-06-01T00:00:00.000Z TO 2019-09-01T23:59:59.999Z] AND endPosition:[2017-06-01T00:00:00.000Z TO 2019-09-01T23:59:59.999Z] ) AND ( (platformname:Sentinel-2 AND filename:S2A_* AND producttype:S2MSI1C) )
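The query above can also be assembled from a bounding box and a date range instead of being typed by hand. The following is a minimal sketch; the function name `build_query` and its parameters are hypothetical and only reproduce the structure of the query shown above.

```python
def build_query(lon_min, lat_min, lon_max, lat_max, start, end,
                platform="Sentinel-2", filename="S2A_*", product_type="S2MSI1C"):
    """Build a search-box query with the same structure as the one above.

    The helper name and its arguments are illustrative assumptions.
    """
    # Closed rectangle: the first corner is repeated at the end
    ring = [
        (lon_min, lat_min),
        (lon_max, lat_min),
        (lon_max, lat_max),
        (lon_min, lat_max),
        (lon_min, lat_min),
    ]
    polygon = ",".join(f"{lon} {lat}" for lon, lat in ring)
    window = f"[{start} TO {end}]"
    return (
        f'( footprint:"Intersects(POLYGON(({polygon})))" ) AND '
        f"( beginPosition:{window} AND endPosition:{window} ) AND "
        f"( (platformname:{platform} AND filename:{filename} "
        f"AND producttype:{product_type}) )"
    )


query = build_query(
    103.76445041927829, 31.681927098519466,
    103.85257167382683, 31.767318441848857,
    "2017-06-01T00:00:00.000Z", "2019-09-01T23:59:59.999Z",
)
print(query)
```

The printed string can be pasted into the search box; only the corner coordinates and the two timestamps need to change for another area or period.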
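Set the number of results per page to the maximum (150) so that as many products as possible fit on one page, which reduces the number of times the steps below have to be repeated.
Save the web page. In Firefox the shortcut is Ctrl+S; choose the save type "Web Page, complete".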
[Figures: the search results page compared with the cart]
Crawl the download links and save them (the code from the referenced post, with some modifications):
from bs4 import BeautifulSoup
import pandas as pd

# The search results page saved from the browser
filepath = 'scihub.copernicus.eu.htm'
with open(filepath, 'rb') as f:
    ss = f.read()

soup = BeautifulSoup(ss, 'html.parser')
# Each product's download link sits in a div with this class
divfind = soup.find_all('div', class_='list-link selectable')

linklist = []
idlist = []
for df in divfind:
    link = df.find('a').string
    # The product ID is the text between the single quotes in the link
    product_id = link.split('\'')[1]
    linklist.append(link)
    idlist.append(product_id)

linkdataframe = pd.DataFrame(linklist)
iddataframe = pd.DataFrame(idlist)

# Write the links and the IDs to two sheets of one workbook
with pd.ExcelWriter('Httpandid.xlsx') as hifile:
    linkdataframe.to_excel(hifile, sheet_name='HTTP', header=False, index=False)
    iddataframe.to_excel(hifile, sheet_name='ID', header=False, index=False)
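The ID extraction in the script relies on the product UUID sitting between single quotes in the link. A minimal sketch of that step in isolation follows; the helper name `extract_product_id` is hypothetical, and the all-zero UUID in the sample link is made up for illustration.

```python
def extract_product_id(link):
    """Return the text between the first pair of single quotes in a link.

    Illustrative helper; raises ValueError when the link has no quoted ID.
    """
    parts = link.split("'")
    if len(parts) < 3:
        raise ValueError("no quoted product ID found in: " + link)
    return parts[1]


# Made-up UUID, used only to show the shape of the link
sample_link = (
    "https://scihub.copernicus.eu/dhus/odata/v1/"
    "Products('00000000-0000-0000-0000-000000000000')/$value"
)
print(extract_product_id(sample_link))
```

Raising an error on an unexpected link makes a changed page layout visible right away, whereas a bare `split(...)[1]` would fail with a less informative IndexError.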
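3. IDM downloader setup
Install IDM from the official website; a license must be registered once the trial period ends.
Open IDM and go to Downloads → Options → Site Logins → New, then set the site path (https://scihub.copernicus.eu), user name, and password.
Go to Downloads → Options → Connection → New and set the maximum number of connections to 1 (Sentinel data accounts limit the number of concurrent downloads).
Edit the queue and set the number of files downloaded at the same time to 2.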
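Besides the GUI settings, the download script in section 4 hands links to IDM through its command-line switches. The sketch below only builds those argument lists so they can be inspected without launching IDM; the helper names are hypothetical, and the executable path is the one used in section 4.

```python
IDM = r"C:\Program Files (x86)\Internet Download Manager\IDMan.exe"


def idm_add_command(idm_exe, url, download_dir):
    """Arguments that add one URL to the IDM queue without starting it.

    /d URL  : the file to download
    /p path : the local folder to save into
    /n      : silent mode, IDM asks no questions
    /a      : add to the download queue but do not start downloading
    """
    return [idm_exe, "/d", url, "/p", download_dir, "/n", "/a"]


def idm_start_command(idm_exe):
    """Arguments that start the queue in the IDM scheduler (/s)."""
    return [idm_exe, "/s"]


# Placeholder URL for illustration only
print(idm_add_command(IDM, "https://example.com/file.zip", "Sentinel2_Download"))
print(idm_start_command(IDM))
```

Passing either list to `subprocess.call` is what actually invokes IDM, as the script in section 4 does.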
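4. Batch-downloading offline data with sentinelsat
Install the sentinelsat package. Note: under Python 3.6, sentinelsat raised an error; this did not affect downloading online products, but retrieval of offline products could not be triggered. Under Python 3.8 it ran normally.
pip install sentinelsat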
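Given the version note above, a small guard at the top of the script can warn before anything is requested. This is a minimal sketch; the 3.8 threshold is an assumption taken from the two versions mentioned in this post (3.7 was not tested here), and the helper name is hypothetical.

```python
import sys

# Assumption: 3.8 is the lowest version reported as working in this post
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if not python_is_supported():
    print("Warning: offline retrieval may not be triggered on this Python version")
```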
Check which of the links saved in step 2 are already online and hand them to IDM for download:
from subprocess import call
from sentinelsat import SentinelAPI
import pandas as pd
import time
import os
from tqdm import tqdm

# Path to the IDM executable and the folder that receives the downloads
IDM = r"C:\Program Files (x86)\Internet Download Manager\IDMan.exe"
DownPath = r'Sentinel2_Download'
if not os.path.exists(DownPath):
    os.makedirs(DownPath)

# Replace 'username' and 'password' with the account credentials
api = SentinelAPI('username', 'password', 'https://scihub.copernicus.eu/dhus')

# Read the links written in step 2 (sheet "HTTP", first column, no header row).
# Step 2 writes an .xlsx file, which recent xlrd releases cannot open, so pandas is used here.
filepath = 'Httpandid.xlsx'
linklist = pd.read_excel(filepath, sheet_name='HTTP', header=None)[0].tolist()

print('Starting task ..................')
n = 0
while linklist:
    print('---------------------------------------------------')
    n = n + 1
    print('\n')
    print('Round ' + str(n) + '\n\n')
    # The product ID is the text between the single quotes in the link
    link = linklist[0]
    product_id = link.split('\'')[1]
    product_info = api.get_product_odata(product_id)
    print('Checking the first product in the list:')
    print('Product ID: ' + product_id)
    print('Product file name: ' + product_info['title'] + '\n')
    if product_info['Online']:
        print(product_info['title'] + ' is an online product')
        print('Adding to IDM: ' + link)
        # Add the link to the IDM queue, then start the queue
        call([IDM, '/d', link, '/p', DownPath, '/n', '/a'])
        linklist.remove(link)
        call([IDM, '/s'])
    else:
        print(product_info['title'] + ' is an offline product')
        print('Requesting its retrieval')
        # Calling download() on an offline product asks the server to bring it online
        api.download(product_id)
        print('Checking the rest of the list for online products .........')
        if len(linklist) > 1:
            ilist = []
            for i in range(1, len(linklist)):
                link2 = linklist[i]
                id2 = link2.split('\'')[1]
                product_info2 = api.get_product_odata(id2)
                if product_info2['Online']:
                    print(product_info2['title'] + ' is an online product')
                    print('ID: ' + id2)
                    print('Adding to IDM: ' + link2)
                    print('--------------------------------------------')
                    call([IDM, '/d', link2, '/p', DownPath, '/n', '/a'])
                    ilist.append(link2)
            if len(ilist) > 0:
                call([IDM, '/s'])
            # Drop the links that were handed to IDM in this round
            for il in ilist:
                linklist.remove(il)
        print('This round is finished; waiting 40 minutes')
        # Move the offline product to the end of the list so it is checked again later
        linklist.remove(link)
        linklist.append(link)
        # 1200 steps of 2 seconds = 40 minutes, shown as a progress bar
        for i in tqdm(range(1200), ncols=100):
            time.sleep(2)
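The control flow of the loop above can be tried out without a Copernicus account or IDM by passing the network-dependent steps in as functions. The following is a minimal sketch of a single round; `run_round` and its callback parameters are hypothetical names, not part of sentinelsat or IDM.

```python
def run_round(links, is_online, enqueue, trigger):
    """Process one round and return (remaining_links, must_wait).

    is_online(link) -> bool  stands in for the 'Online' flag lookup
    enqueue(link)            stands in for handing the link to IDM
    trigger(link)            stands in for requesting offline retrieval
    """
    first, rest = links[0], links[1:]
    if is_online(first):
        # Online product: hand it over and continue immediately
        enqueue(first)
        return rest, False
    # Offline product: request retrieval, then scan the rest of the list
    trigger(first)
    still_pending = []
    for link in rest:
        if is_online(link):
            enqueue(link)
        else:
            still_pending.append(link)
    # The triggered product goes to the back and is re-checked after the wait
    return still_pending + [first], True


# Fake status table: only "b" is online
status = {"a": False, "b": True, "c": False}
queued, triggered = [], []
remaining, must_wait = run_round(
    ["a", "b", "c"], status.get, queued.append, triggered.append
)
print(remaining, must_wait, queued, triggered)
```

As in the script, only the first offline product is triggered per round, and the caller sleeps only when `must_wait` is true.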