
Chapter 1. Preparing the Data
In this chapter, we will cover the basic tasks of reading, storing, and cleaning data using Python and OpenRefine. It contains the following recipes:
- Reading and writing CSV/TSV files with Python
- Reading and writing JSON files with Python
- Reading and writing Excel files with Python
- Reading and writing XML files with Python
- Retrieving HTML pages with pandas
- Storing and retrieving from a relational database
- Storing and retrieving from MongoDB
- Opening and transforming data with OpenRefine
- Exploring the data with OpenRefine
- Removing duplicates
- Using regular expressions and GREL to clean up the data
- Imputing missing observations
- Normalizing and standardizing features
- Binning the observations
- Encoding categorical variables
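As a small taste of the first recipe in the list above, the following is a minimal sketch of reading CSV data and writing it back out as TSV. It uses only Python's standard `csv` and `io` modules, which is an assumption made here to keep the example self-contained; the recipes themselves may rely on other libraries. The sample rows and the `csv_to_tsv` helper are hypothetical and exist purely for illustration.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
csv_text = "id,city,price\n1,Sacramento,59222\n2,Rio Linda,68212\n"


def csv_to_tsv(text):
    """Parse comma-separated text and return the same rows tab-separated."""
    # csv.reader accepts any iterable of lines; StringIO stands in for a file.
    rows = list(csv.reader(io.StringIO(text)))

    out = io.StringIO()
    # The delimiter is the only difference between a CSV and a TSV writer.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


tsv_text = csv_to_tsv(csv_text)

# Read the TSV text back to confirm that the rows survive the round trip.
rows_back = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
```

With real files, the `io.StringIO` objects would be replaced by `open(path, newline="")` handles, as the `csv` module documentation recommends.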