https://pypi.org/project/icrawler/

 

icrawler: a mini framework of image crawlers (pypi.org)

 

https://icrawler.readthedocs.io

 

Welcome to icrawler (icrawler 0.6.6 documentation, icrawler.readthedocs.io)

Architecture: a crawler consists of 3 main components (Feeder, Parser and Downloader), which are connected to each other by FIFO queues. url_queue stores the URLs of pages which may contain images. task_queue ...
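
To get a feel for the architecture the documentation describes (Feeder, Parser and Downloader connected by FIFO queues), here is a minimal sketch using only the standard library. It is not icrawler's actual implementation: the function names, the example.com URLs, the "image_url" key and the None sentinel are all my own assumptions for illustration, and nothing is really downloaded.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, not part of icrawler


def feeder(keyword, pages, url_queue):
    """Put the URLs of pages that may contain images into url_queue."""
    for page in range(pages):
        # Made-up URL pattern, for illustration only.
        url_queue.put(f"https://example.com/search?q={keyword}&page={page}")
    url_queue.put(SENTINEL)


def parser(url_queue, task_queue):
    """Turn each page URL into download tasks and put them into task_queue."""
    while True:
        page_url = url_queue.get()
        if page_url is SENTINEL:
            task_queue.put(SENTINEL)
            break
        # A real parser would fetch and parse the page;
        # here we pretend every page contains two images.
        for i in range(2):
            task_queue.put({"image_url": f"{page_url}&img={i}"})


def downloader(task_queue, results):
    """Consume tasks; a real downloader would save each file to storage."""
    while True:
        task = task_queue.get()
        if task is SENTINEL:
            break
        results.append(task["image_url"])


def run_pipeline(keyword, pages=2):
    url_queue = queue.Queue()   # FIFO queue between Feeder and Parser
    task_queue = queue.Queue()  # FIFO queue between Parser and Downloader
    results = []
    threads = [
        threading.Thread(target=feeder, args=(keyword, pages, url_queue)),
        threading.Thread(target=parser, args=(url_queue, task_queue)),
        threading.Thread(target=downloader, args=(task_queue, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for url in run_pipeline("cat"):
        print(url)
```

The sketch runs one thread per stage so that a single sentinel is enough to shut everything down. With several threads per stage (like downloader_threads=4 in the snippet below), each worker would need its own shutdown signal.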

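from icrawler.builtin import GoogleImageCrawler

# One feeder thread, one parser thread and four downloader threads;
# downloaded images are saved under 'your_image_dir'.
google_crawler = GoogleImageCrawler(
    feeder_threads=1,
    parser_threads=1,
    downloader_threads=4,
    storage={'root_dir': 'your_image_dir'})
# Search filters: large, orange-toned images licensed for commercial use
# and modification, dated between 2017-01-01 and 2017-11-30.
filters = dict(
    size='large',
    color='orange',
    license='commercial,modify',
    date=((2017, 1, 1), (2017, 11, 30)))
# The keyword '고양이' means "cat". 'ko' is the ISO 639-1 language code for
# Korean ('kr' is the country code); assuming a language code is expected here.
google_crawler.crawl(keyword='고양이', language='ko', filters=filters, offset=0, max_num=1000,
                     min_size=(200, 200), max_size=None, file_idx_offset=0)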

Ah..

The crawl returns chicken photos too....

Ah, but..

 

How do I scrape an image URL in Python? ???
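
One possible answer, as a rough sketch: parse the page's HTML and collect the src attribute of every img tag. This uses only the standard library (html.parser and urllib.parse). The ImageURLParser class, the extract_image_urls helper and the example.com sample are hypothetical names I made up for illustration, and fetching the page itself is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageURLParser(HTMLParser):
    """Collect the src attribute of every <img> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    parser = ImageURLParser(base_url)
    parser.feed(html)
    return parser.image_urls


if __name__ == "__main__":
    sample = (
        '<html><body>'
        '<img src="/static/cat.jpg">'
        '<img src="https://example.com/dog.png">'
        '<img alt="no src here">'
        '</body></html>'
    )
    print(extract_image_urls(sample, "https://example.com/gallery/"))
```

This only sees img tags present in the raw HTML, so images that a page loads later with JavaScript would not show up.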
 