
Get started web scraping with Scrapy and Python

First up, install pip

1 min read · Mar 24, 2022

Pip is a package manager for Python. There are a couple of strange gotchas. The first, covered in a highly active Stack Overflow question, is that the install script for pip on Mac doesn’t quite work. Instead you need to run:

$ python -m ensurepip --upgrade --user

With that, you should have the “pip3” command available. As far as I can tell, it’s “pip3” rather than plain “pip” to keep it separate from the pip that belongs to Python 2, a leftover from the long Python 2 to 3 transition. Either way, we’ll just go with it.

$ pip3 --version
pip 21.3.1 from /usr/local/lib/python3.9/site-packages/pip (python 3.9)
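
If “pip3” isn’t on your PATH, a common fallback is to run pip as a module of a specific Python interpreter. Here is a minimal sketch of that check from Python itself, assuming Python 3.7 or newer. The helper name pip_version is made up for illustration:

```python
import subprocess
import sys


def pip_version():
    """Return the output of `<this python> -m pip --version`, or None if pip is unavailable."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
    )
    if result.returncode != 0:
        # pip is not installed for this interpreter
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(pip_version() or "pip is not installed for this interpreter")
```

Going through sys.executable means the answer is about the interpreter running the script, which may not be the one behind whatever “pip3” resolves to in your shell.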

Install Scrapy

Now that we have pip, we can use it to install Scrapy:

$ pip3 install Scrapy

Running scrapy with the --version flag confirms the install and lists the available commands:

$ scrapy --version
Scrapy 2.6.1 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  commands
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
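
The install can also be checked from Python, which helps when the scrapy command isn’t on your PATH or you’re not sure which interpreter pip installed into. This is a rough sketch, assuming Python 3.8 or newer for importlib.metadata. The helper name installed_version is hypothetical:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("Scrapy")
    print(f"Scrapy {found}" if found else "Scrapy is not installed for this interpreter")
```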

Create a project

Now that we’ve got Scrapy installed, we can follow the getting started instructions to make a new project.

This will create a project called “dataharvester”:

$ scrapy startproject dataharvester
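
To sanity-check what that command generated, here is a small sketch that looks for the files the default project template creates. The file list reflects the Scrapy 2.x template as I understand it, so treat it as an assumption for other versions. The helper names expected_files and missing_files are made up for illustration:

```python
from pathlib import Path


def expected_files(name):
    """Files `scrapy startproject <name>` is assumed to create, relative to the project directory."""
    package = Path(name)
    return [
        Path("scrapy.cfg"),                 # deploy/config file at the project root
        package / "__init__.py",
        package / "items.py",               # item definitions
        package / "middlewares.py",         # spider and downloader middlewares
        package / "pipelines.py",           # item pipelines
        package / "settings.py",            # project settings
        package / "spiders" / "__init__.py",  # package where your spiders go
    ]


def missing_files(project_dir, name):
    """Return the expected files (as POSIX-style relative paths) that are not present."""
    root = Path(project_dir)
    return [p.as_posix() for p in expected_files(name) if not (root / p).is_file()]


if __name__ == "__main__":
    # Run from the directory where `scrapy startproject dataharvester` was executed.
    print(missing_files("dataharvester", "dataharvester") or "project layout looks complete")
```

The project directory and the Python package inside it share the same name, which is why “dataharvester” is passed twice.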


Written by Connor Leech
