https://pypi.org/project/icrawler/
icrawler
A mini framework of image crawlers
https://icrawler.readthedocs.io
Welcome to icrawler — icrawler 0.6.6 documentation
Architecture: A crawler consists of 3 main components (Feeder, Parser and Downloader), which are connected to each other by FIFO queues. The workflow is shown in the following figure. url_queue stores the URLs of pages which may contain images. task_queue ...
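To get a feel for the Feeder -> Parser -> Downloader layout described in the docs excerpt, here is a minimal sketch of three stages joined by two FIFO queues. It uses only the standard library and is not icrawler's actual code: the function names, the `SENTINEL` marker, and the `extract_images` callback are all made up for illustration, and the "downloader" only records the image URLs instead of fetching them.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-input marker passed down the queues


def feeder(page_urls, url_queue):
    """Put the URLs of pages that may contain images onto url_queue."""
    for url in page_urls:
        url_queue.put(url)
    url_queue.put(SENTINEL)


def parser(url_queue, task_queue, extract_images):
    """Turn each page URL into image tasks using the caller-supplied callback."""
    while True:
        url = url_queue.get()
        if url is SENTINEL:
            task_queue.put(SENTINEL)  # tell the downloader we are done
            break
        for image_url in extract_images(url):
            task_queue.put(image_url)


def downloader(task_queue, results):
    """Consume image tasks; a real downloader would fetch and save each file."""
    while True:
        task = task_queue.get()
        if task is SENTINEL:
            break
        results.append(task)


def run_pipeline(page_urls, extract_images):
    """Run one thread per stage and return the tasks the downloader saw."""
    url_queue = queue.Queue()   # FIFO: page URLs
    task_queue = queue.Queue()  # FIFO: image URLs to download
    results = []
    threads = [
        threading.Thread(target=feeder, args=(page_urls, url_queue)),
        threading.Thread(target=parser, args=(url_queue, task_queue, extract_images)),
        threading.Thread(target=downloader, args=(task_queue, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Fake "pages" so the sketch runs without any network access.
    fake_pages = {"page1": ["a.jpg", "b.jpg"], "page2": ["c.jpg"]}
    print(run_pipeline(list(fake_pages), fake_pages.get))
```

The sketch runs one thread per stage so a single sentinel is enough to shut everything down. With several downloader threads, as in the icrawler example in this post, each worker would need its own stop signal.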
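from icrawler.builtin import GoogleImageCrawler

# One feeder thread, one parser thread, and four downloader threads;
# downloaded images are saved under root_dir.
google_crawler = GoogleImageCrawler(
    feeder_threads=1,
    parser_threads=1,
    downloader_threads=4,
    storage={'root_dir': 'your_image_dir'})

# Search filters: large, orange-toned images licensed for commercial
# use and modification, dated 2017-01-01 through 2017-11-30.
filters = dict(
    size='large',
    color='orange',
    license='commercial,modify',
    date=((2017, 1, 1), (2017, 11, 30)))

# '고양이' is Korean for "cat". The language code for Korean is 'ko'
# ('kr' is the country code, not a language code).
google_crawler.crawl(keyword='고양이', language='ko', filters=filters, offset=0, max_num=1000,
                     min_size=(200, 200), max_size=None, file_idx_offset=0)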
Ah..
The crawl brings back chicken photos too....
Ah, but..
How do I scrape an image URL in Python?
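One possible answer, as a rough sketch: pull the `src` of every `<img>` tag out of a page's HTML using only the standard library. The `ImageURLParser` class and `extract_image_urls` function are names made up here, and the sketch assumes you already have the HTML as a string. It only sees images present in the raw HTML, so images loaded later by JavaScript (common on search result pages) would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageURLParser(HTMLParser):
    """Collect absolute image URLs from <img src="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths such as "/a.png" against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return the image URLs found in html, in document order."""
    parser = ImageURLParser(base_url)
    parser.feed(html)
    return parser.image_urls


if __name__ == "__main__":
    sample = '<p><img src="/a.png"><img src="https://x.test/b.jpg"/></p>'
    print(extract_image_urls(sample, "https://example.com/page"))
```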