
上QQ阅读APP看书,第一时间看更新
How it works...
We begin by reading in our dataset (step 1), which consists of the PE header information for a collection of PE files. These vary greatly, with some columns reaching hundreds of thousands of files, and others staying in the single digits. Consequently, certain models, such as neural networks, will perform poorly on such unstandardized data. In step 2, we instantiate StandardScaler() and then apply it to rescale X using .fit_transform(X). As a result, we obtained a rescaled dataset, whose columns (corresponding to features) have a mean of 0 and a variance of 1.