# Recognizing Handwritten Digits with Scikit-Learn

Recognizing handwritten text is a problem that can be traced back to the first automatic machines that needed to recognize individual characters in handwritten documents. Think about, for example, the ZIP codes on letters at the post office and the automation needed to recognize these five digits. Perfect recognition of these codes is necessary to sort mail automatically and efficiently.

But the problem of handwriting recognition goes further back in time, to the early 20th century (the 1920s), when Emanuel Goldberg (1881–1970) began studying the issue and suggested that a statistical approach would be an optimal choice.

The scikit-learn library (http://scikit-learn.org/) enables you to approach this type of data analysis.

The problem we have to face involves predicting a numeric value by reading and interpreting an image of a handwritten digit. In this case we will have an estimator whose task is to learn through a fit() function; once it has reached a sufficient degree of predictive capability (a sufficiently valid model), it will produce a prediction with the predict() function. We will then discuss the training set and the validation set, which this time are built from a series of images.

**The Digits Dataset**

The scikit-learn library provides numerous datasets that are useful for testing data analysis and prediction problems. For this case it includes a dataset of images called Digits, which consists of 1,797 images that are 8x8 pixels in size.

**Step 1.** An estimator that is useful in this case is sklearn.svm.SVC, which uses the technique of Support Vector Classification (SVC).

First import the svm module of the scikit-learn library. Then create an estimator of SVC type and choose an initial setting, assigning generic values to the C and gamma parameters. Finally, load the Digits dataset.
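A minimal sketch of this step might look as follows. The values `gamma=0.001` and `C=100.0` are illustrative assumptions chosen as a generic starting point; the text does not prescribe specific values.

```python
from sklearn import datasets, svm

# Create the SVC estimator with generic initial settings.
# gamma=0.001 and C=100.0 are assumed values, not taken from the text.
svc = svm.SVC(gamma=0.001, C=100.0)

# Load the Digits dataset bundled with scikit-learn.
digits = datasets.load_digits()
```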

**Step 2.** You can read a description of the dataset by printing its DESCR attribute:

**print(digits.DESCR)**

The images of the handwritten digits are contained in the **digits.images** array, each element of which is an 8x8 matrix representing one image.
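As a short illustration, the description and the image array can be inspected like this (the dataset is reloaded here only so that the snippet runs on its own):

```python
from sklearn import datasets

digits = datasets.load_digits()

# Textual description of the dataset.
print(digits.DESCR)

# Shape of the image array: (number of images, rows, columns).
print(digits.images.shape)

# Pixel intensities of the first image as an 8x8 matrix.
print(digits.images[0])
```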

**Step 3.** The numerical values represented by the images, i.e., the targets, are contained in the

**digits.target** array.

You can check **digits.target.size** to find out the size of the target array.
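A brief sketch of inspecting the targets, again reloading the dataset so the snippet is self-contained:

```python
from sklearn import datasets

digits = datasets.load_digits()

# Each entry is the digit (0-9) shown in the corresponding image.
print(digits.target)

# Number of targets; there is one per image.
print(digits.target.size)
```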

**Step 4.** You can visualize the dataset using the matplotlib library.
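One possible way to do this is sketched below. The 2x3 grid, the choice of the first six images, and the inverted grayscale colormap are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

digits = datasets.load_digits()

# Show the first six images in a 2x3 grid, each titled with its target.
fig, axes = plt.subplots(2, 3, figsize=(6, 4))
for ax, image, label in zip(axes.ravel(), digits.images, digits.target):
    ax.imshow(image, cmap="gray_r", interpolation="nearest")
    ax.set_title(f"Target: {label}")
    ax.axis("off")

plt.show()
```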

**Step 5.** Now you can train the svc estimator that you defined earlier by calling its fit() function on the training set.

To test the estimator, use **svc.predict** to make it interpret the six digits of the validation set.
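The training and prediction steps could be sketched as follows. The split used here (the last six images held out for validation, all others used for training) and the `gamma` and `C` values are assumptions, since the text does not state them. The sketch uses `digits.data`, which holds the same images flattened into 64-element rows, the form that fit() and predict() expect.

```python
from sklearn import datasets, svm

digits = datasets.load_digits()
svc = svm.SVC(gamma=0.001, C=100.0)  # assumed illustrative values

# Assumed split: hold out the last six images as the validation set.
X_train, y_train = digits.data[:-6], digits.target[:-6]
X_valid, y_valid = digits.data[-6:], digits.target[-6:]

# Train the estimator on the training set.
svc.fit(X_train, y_train)

# Predict the six held-out digits and compare them with the true targets.
predicted = svc.predict(X_valid)
print("Predicted:", predicted)
print("Expected: ", y_valid)
```

Comparing the two printed arrays shows how many of the held-out digits were recognized under this particular split.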

As we can see, the svc estimator has learned correctly. It is able to recognize the handwritten digits, correctly interpreting all six digits of the validation set.