Intelligent Character Recognition System Using Convolutional Neural Network

Computational linguistics draws on computer science techniques that play a vital role in recognizing written or printed characters, such as numbers or letters, and converting them into a form that a computer can use efficiently. A Convolutional Neural Network (CNN) differs from other approaches in that it extracts features automatically. The proposed approach uses a CNN to recognize characters under a variety of challenging conditions in which traditional character recognition systems fail, notably low resolution, substantial blur, low contrast, and other distortions. The Intelligent Character Recognition System is an application that uses a CNN to recognize characters from the Tamil character dataset developed by HP Labs India. The novelty of this system is that it recognizes the characters of the predominant Tamil language. The model is trained on suitable datasets of Tamil script. This work achieved a training accuracy of 99.16%, which is considerably better than traditional approaches.


Introduction
Machine recognition of handwritten characters is becoming increasingly relevant in the modern world. Handwriting recognition is the ability of a computer to receive and interpret intelligible handwritten input. The purpose of this project is to take Tamil handwritten characters as input and recognize them. The major challenge, as in any handwritten character recognition problem, is the large variation in writing styles, both between different people and for the same individual at different times: for example, the size, shape, writing speed, and stroke thickness of characters. The problem of printed character recognition is relatively well understood and solved under few constraints, and available systems yield close to the maximum recognition accuracy. Handwritten character recognition systems, however, still have limited capabilities. Other difficulties include the similarity of some characters to each other and the practically unlimited variety of character shapes. Handwritten character recognition is one of the most popular areas of research in pattern recognition because of its immense application potential. It is relatively mature for languages such as English, Chinese, Korean, Japanese, and Arabic. Among Indian languages, studies are active in Devanagari and Bangla, and some promising research findings have been reported for south Indian (Dravidian) languages such as Telugu and Kannada. For Tamil, although many research papers have been published in this area, the reported results are inadequate for the design of efficient handwritten character recognition systems. This is the motivation behind this work.
Linguistics is the scientific study of human language. Its main aim is to establish a theory by studying the nature of a language and to apply that theory to describe other languages. It involves analyzing language form, language meaning, and language in context. Linguists traditionally analyze human language by observing the interplay between sound and meaning. Computational linguistics (CL) is the application of computer science to the analysis, synthesis, and comprehension of written and spoken language. It is an interdisciplinary field dealing with the statistical and logical modeling of natural language from a computational perspective. Computational linguistics is used in instant machine translation, speech recognition (SR) systems, text-to-speech (TTS) synthesizers, interactive voice response (IVR) systems, search engines, text editors, and language instruction materials. The field requires expertise in machine learning (ML), deep learning (DL), artificial intelligence (AI), cognitive computing, and neuroscience. The main goal of CL is to make computers understand human language, since computer languages do not match the structure of human thought. Computational linguistics is closely related to Natural Language Technology, Natural Language Engineering, Natural Language Processing, and Artificial Intelligence. A computational understanding of language provides human beings with insight into thinking and intelligence. Computers that are linguistically competent not only facilitate human interaction with machines and software, but also make the textual and other resources of the internet readily available in multiple languages.
* Corresponding author. Email: suriyas84@gmail.com

Literature Survey
Nicole Dalia Cilia et al. [1] proposed a ranking-based feature selection approach for handwritten character recognition. Its aim is to reduce the computational cost of the classification task while increasing, or at least not reducing, the classification performance. Existing feature selection methods have several limitations, including computational complexity, dependence on the adopted classifier, and the difficulty of evaluating interactions among features. Greedy search approaches had been proposed for choosing the feature subset that maximizes the classification results, but searching for effective feature subsets in this way is computationally expensive, and it is difficult to take the effects of feature interactions into account. To overcome these drawbacks, the authors adopted feature-ranking techniques, in which different univariate measures are used to produce a feature ranking, and combined them with a greedy strategy that selects feature subsets of increasing size by adding features progressively according to their position in the ranking. This significantly reduces the computational complexity of the whole recognition system with very limited effect on classification performance, at an error rate of 21%. The results show that it is possible to reduce the number of features significantly, and consequently the complexity of the classification task, while accepting a limited reduction of the recognition rate.
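The ranking-plus-greedy idea described above can be illustrated with a minimal sketch. It is not the method of [1]: the Fisher-style score, the nearest-centroid classifier, and all function names are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Assumed univariate measure: between-class over within-class variance."""
    overall = mean(values)
    between = within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    return between / within if within > 0 else float("inf")


def rank_features(X, y):
    """Return feature indices ordered from highest to lowest score."""
    n_features = len(X[0])
    scores = [fisher_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for cls in set(y_train):
        rows = [r for r, lab in zip(X_train, y_train) if lab == cls]
        centroids[cls] = [mean(r[j] for r in rows) for j in features]
    correct = 0
    for row, label in zip(X_test, y_test):
        point = [row[j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        correct += predicted == label
    return correct / len(y_test)


def select_by_ranking(X_train, y_train, X_val, y_val):
    """Grow the subset along the ranking and keep the best-scoring prefix."""
    ranking = rank_features(X_train, y_train)
    best_subset, best_accuracy = [], -1.0
    for k in range(1, len(ranking) + 1):
        subset = ranking[:k]
        accuracy = nearest_centroid_accuracy(X_train, y_train, X_val, y_val, subset)
        if accuracy > best_accuracy:
            best_subset, best_accuracy = subset, accuracy
    return best_subset, best_accuracy
```

Because only prefixes of the ranking are evaluated, the classifier is trained once per subset size rather than once per candidate subset, which is where the saving in computational cost comes from.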
Manpreet Kaur et al. [2] introduced Proposed Approach for Layout & Handwritten Character Recognition, which recognizes characters in datasets containing heterogeneous data. The Radial Sector Coding (RSC) algorithm is used to detect arbitrarily oriented text in an image, and an information energy approach is used to segment the lines. Achieving rotation invariance was a challenging task. It was achieved by finding the Axis of Reference, a rotation-invariant feature for all characters, and then deriving from it the Line of Reference, which serves as the zero line for feature generation; in this way translation-, rotation-, and scale-invariant features were generated. The 26 uppercase English characters were used for the performance evaluation of RSC, with differently rotated and scaled versions of each character in the Arial and Tahoma fonts. Two sets of characters were used, one for training an artificial neural network and the other for performance testing, and the algorithm was implemented in MATLAB. The method was applied to documents containing three languages (Arabic, English, and French) as printed and handwritten text. It proceeds in four steps. The first step is preprocessing, which removes noise in the form of small connected components (CCs). In the second step, text and non-text regions are classified with a learning-based approach in which features of each CC and its neighborhood are fed to an MLP classifier. The third step is layer separation, which distinguishes typed text from handwritten text using a codebook method. The fourth step is block segmentation: the RLSA algorithm is first applied to connect nearby CCs, and the document is then segmented along white space, filtering out small rectangles.
The main motive is to make this approach user specific, so that a shopkeeper or another person who is not very familiar with computers can store and analyze their bills. More enhanced techniques were applied to the images for better analysis of heterogeneous images.
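The RLSA step used in the block-segmentation stage of [2] can be sketched as follows. This is a minimal horizontal run-length smoothing pass under assumed conventions (1 for ink, 0 for background); the function name and the `max_gap` threshold are illustrative and are not taken from the paper.

```python
def rlsa_horizontal(rows, max_gap):
    """Fill background runs of at most `max_gap` pixels that lie between
    two ink pixels on the same row (1 = ink, 0 = background)."""
    smoothed = []
    for row in rows:
        out = list(row)
        last_ink = None  # column of the most recent ink pixel in this row
        for x, pixel in enumerate(row):
            if pixel == 1:
                gap = x - last_ink - 1 if last_ink is not None else 0
                if 0 < gap <= max_gap:
                    for k in range(last_ink + 1, x):
                        out[k] = 1
                last_ink = x
        smoothed.append(out)
    return smoothed
```

With a small threshold, letters of one word merge into a single connected block while wider gaps between words or columns stay open, which is what allows the later white-space segmentation.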
Yongchun Zhu et al. [3] proposed an Adaptively Transfer Category-Classifier for Handwritten Chinese Character Recognition, a new neural network structure for Handwritten Chinese Character Recognition (HCCR) that makes full use of a large amount of labeled source-domain data and a small amount of target-domain data to learn the model parameters. Transfer learning aims to adapt the knowledge from related source-domain data to model learning in the target domain, which makes it promising for HCCR tasks. The transfer model is based on the successful deep network structure AlexNet. Specifically, for both source- and target-domain data, the network shares the parameters of five convolution layers and three pooling layers and then learns the parameters of three fully connected layers separately. Transfer takes place at the category-classifier level: to adaptively transfer category knowledge from the source domain to the target domain, a regularization term is imposed with different weights, assigned according to the usefulness of the source-domain data, to learn the similarity between the category-classifiers trained on the source and target domains. Extensive experiments on three datasets, HCL2000, CASIA-HWDB1.1, and MSS-HCC, demonstrate the effectiveness of the model compared with several state-of-the-art baselines. The dataset contains 6600 shapes of handwritten characters written by 50 persons and is divided into a training set of 5280 images and a test set of 1320 images. The result was promising, with a classification error rate of 2.1%.
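One plausible form of the weighted regularization term described above is a per-category squared distance between the source and target classifier weights. The sketch below assumes that form; the exact term used in [3] is not given here, and the function and parameter names are hypothetical.

```python
def transfer_regularizer(source_weights, target_weights, category_weights):
    """Assumed penalty: sum over categories c of a_c * ||w_s^c - w_t^c||^2.

    source_weights / target_weights: one weight vector per category.
    category_weights: one non-negative weight a_c per category, larger when
    the source-domain classifier for that category is considered more useful.
    """
    total = 0.0
    for w_source, w_target, a in zip(source_weights, target_weights, category_weights):
        total += a * sum((s - t) ** 2 for s, t in zip(w_source, w_target))
    return total
```

A large weight keeps the target classifier for that category close to its source counterpart, while a weight near zero lets it be learned almost freely from the target data.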
Raymond Ptucha et al. [4] proposed character recognition using fully convolutional neural networks. The preprocessing step in this approach normalizes the input blocks to a canonical representation, which removes the need for costly recurrent symbol alignment correction. Character-based classification is performed without relying on predefined dictionaries or contextual information. The work focuses on extracted handwritten symbol blocks, and the input is defined as a tightly cropped grayscale image of an arbitrary 1D sequence of symbols. The word symbol emphasizes that the model is not limited to Latin-based characters, does not require spaces between characters, and does not need to predict spaces. Convolution filters alternately align on symbols and blanks, so perfect alignment requires each symbol to be of identical width. Wider filters ensure that the complete symbol is in the receptive field as the filter steps across the word block. The RIMES dataset was used, which contains 60,000 French words written by over 1000 authors. There are several versions of the RIMES dataset, each newer release being a superset of prior releases. Generically trained models, while not as good as those fine-tuned for a particular dataset, performed quite well. The size of the lexicon used by the Lexicon CNN is application dependent; because the fully convolutional Symbol Prediction FCN performs very well, the Lexicon CNN was limited to fewer than 2000 words. This work demonstrates how to replace recurrent neural networks with fully convolutional methods when processing variable-length temporal streams of offline handwriting imagery. These streams are first broken into their constituent parts, and each part is measured in length and resampled to a canonical representation compatible with a fully convolutional network. This divide-and-conquer fully convolutional approach is input-length agnostic and does not suffer from exploding or vanishing gradients.
Thirumalai Murugan et al. [5] proposed a novel approach to handwritten English alphabet and number recognition based on neural networks. Variations in the shape, slope, and size of individual characters are handled by better pre-processing and feature extraction techniques. The feed-forward algorithm gives insight into the inner workings of a neural network; it is followed by the back-propagation algorithm, which comprises training and testing. Performance was evaluated using geometric feature extraction and a gradient technique. The edge detection algorithm maintains a list called the traverse list, which holds the pixels already traversed by the algorithm. The geometry-based feature extraction technique is applicable to segmentation-based word recognition systems. The system extracts the geometric features of the character contour, based on the basic line types that form the character skeleton, and gives a feature vector as its output. The feature vectors generated from a training set were then used to train a pattern recognition engine based on neural networks. A gradient feature vector is composed of the strength of the gradient accumulated separately in different directions. In a neural network, each node performs a simple computation, and each connection conveys a signal from one node to another, labeled by a number that indicates the extent to which the signal is amplified. Input vectors and the corresponding target vectors are used to train a network until it can approximate a function, associate input vectors with specific output vectors, or classify input vectors in an appropriate way.
Networks with biases, a sigmoid layer, and a linear output layer are capable of approximating any function with a finite number of discontinuities. The results show that the back-propagation network provides a good recognition accuracy of more than 90% for handwritten English characters.
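The gradient feature vector mentioned above, the gradient strength accumulated separately per direction, can be sketched as below. Central differences and eight direction bins are assumptions made for illustration; [5] may use a different gradient operator or number of directions.

```python
import math


def gradient_direction_features(image, bins=8):
    """Accumulate gradient magnitude into `bins` direction bins.

    image: 2D list of grayscale values. Border pixels are skipped because
    central differences need both neighbours.
    """
    height, width = len(image), len(image[0])
    features = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            strength = math.hypot(gx, gy)
            if strength == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            features[index] += strength
    return features
```

For an image containing a single vertical edge, all of the accumulated strength falls into the bin for the horizontal gradient direction.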
Muna Ahmed Awel et al. [6] presented a Review on Optical Character Recognition. The paper reviews research on English, Arabic, and Devanagari characters, the methodologies used, and the challenges faced during the development of optical character recognition. For feature extraction, global geometric features and a geometric density classifier were used, and the evaluation achieved accuracy rates of 77.89% for Geometric Density and 76.44% for Geometric Feature. A support vector machine (SVM) was selected for recognition, and the method achieved 86% accuracy on a small dataset. According to the reviewed papers, the major steps are preprocessing, segmentation, feature extraction, and post-processing. The challenges faced during recognition were scene complexity, uneven lighting conditions, skewness (rotation), blurring and degradation, fonts and styles, and multilingual environments. The character recognition systems reviewed use different approaches, and many of them achieve good accuracy. Feature extraction techniques should be chosen according to the characters being recognized, because each script or alphabet has its own nature and therefore needs techniques suited to its characters. The better the features extracted from the characters, the more accurately the characters can be detected and recognized.
Anupam Garg et al. [7] proposed Offline handwritten Gurmukhi character recognition: k-NN vs. SVM classifier, which analyzes the impact of combining feature extraction and classification techniques. Principal component analysis (PCA) is used to find efficient features among peak-extent-based and modified division point (MDP) based features, which are then used in the classification process. For classification, k-NN and SVM with three different kernels (linear-SVM, polynomial-SVM, and RBF-SVM) are compared in terms of recognition accuracy. For selecting the training and testing datasets, five different partitioning strategies and the k-fold cross-validation technique were used. The experiments were performed on a dataset of 8960 samples of offline handwritten Gurmukhi characters written by 160 unique writers.
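The k-fold cross-validation technique mentioned above can be sketched as an index generator. The interleaved assignment of samples to folds is an assumption; [7] does not state how its folds were formed, and the function name is illustrative.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Samples are assigned to folds in an interleaved fashion (0, k, 2k, ...),
    so each sample is used for testing exactly once.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

With five folds, each round trains on four fifths of the samples and tests on the remaining fifth; for a dataset of 8960 samples that is 7168 training and 1792 test samples per round.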
The proposed framework consists of four stages: digitization, pre-processing, feature extraction, and classification. Digitization is the process of converting the paper-based handwritten document into electronic format. It produces a digital image, which is fed to the pre-processing phase. During pre-processing, the digitized image is converted to a thinned image with a stroke width of a single pixel. The partitioning strategy and the k-fold cross-validation technique for selecting the training and testing patterns were both evaluated. The classifiers employed are k-NN, linear-SVM, polynomial-SVM, and RBF-SVM, and combinations of these, using the database partitioning strategy and five-fold cross-validation. A recognition accuracy of 92.3% was achieved using the combination of linear-SVM, polynomial-SVM, and k-NN classifiers with a partitioning strategy of 80% of the data as the training dataset and the remaining data as the testing dataset.
Ariyono Setiawan et al. [8] proposed Handwriting Character Recognition Javanese Letters Based on Artificial Neural Network. The Javanese script is part of Indonesia's cultural heritage that must be preserved, and manuscripts in Javanese characters are a priceless inheritance. Javanese characters are often called the Hanacaraka font. Back-propagation was used, a learning method commonly used by multi-layer perceptrons to change the weights associated with the neurons in the hidden layer. The back-propagation method uses the output error to change the weight values in the backward direction; to obtain this error, the feed-forward stage must be performed first. The scanned images are transformed into grayscale images, and the edges of each character are detected using the Canny algorithm. After the edges are obtained, the image is thresholded into black-and-white form, and the edges are thickened to reinforce the shape of the thresholded image. The image data comprise 200 images for the training process and 100 for the testing process, that is, 10 images per character for training and 5 per character for testing, for a total of 300 images. In the training results, obtained by running the system 6 times, a total of 179 characters matched the system output or reached the target. The testing data were taken from several people with different handwriting. The iteration limit is 100,000, meaning that the training process stops at iteration 100,000 with an error target close to 0 for each character trained. The test results showed that some images could not reach the target because their script patterns have similar values; some characters are somewhat similar in shape and are therefore not recognized. The trials obtained an average accuracy of 74%.
Tianwei Wang et al. [9] introduced Radical aggregation network for few-shot offline handwritten Chinese character recognition, in which a novel radical aggregation network (RAN) is proposed for few-shot/zero-shot offline handwritten Chinese character recognition. The RAN comprises three components: a radical mapping encoder (RME), a radical aggregation module (RAM), and a character analysis decoder (CAD). Zero-shot means that the classifier is trained on a limited set of Chinese character classes that together contain all radicals, and recognition is then performed on unseen classes. Few-shot means that a few support samples of the unseen classes are also added for training. Compared with the large and ever-increasing number of Chinese characters, approximately 1000 radicals suffice to compose over 10,000 characters, and every Chinese character can be decomposed into a unique radical string. Despite the enormous number of Chinese characters, current state-of-the-art methods achieve radical-based zero-shot Chinese character recognition with an encoder-decoder architecture. The RAM computes a distance metric between the radical representations and the radical prototypes; it then draws each radical towards its own prototype while distancing it from the others. The CAD analyzes the radical representations sequentially and transcribes them into a character. Printed Chinese character samples are much easier to obtain than handwritten ones. The RAN introduces a distance metric criterion for radical features to improve robustness and integrates an end-to-end decoding strategy with few support samples used for training, and it achieved good results with a minimal error rate.
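The role of the radical aggregation module, measuring the distance from a radical representation to each prototype and pulling it towards its own prototype while pushing it away from the others, can be illustrated with a small sketch. The squared Euclidean distance, the margin-based loss, and the prototype names are assumptions for illustration and are not the formulation of [9].

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_prototype(feature, prototypes):
    """Return the name of the radical prototype closest to `feature`."""
    return min(prototypes, key=lambda name: squared_distance(feature, prototypes[name]))


def aggregation_loss(feature, true_name, prototypes, margin=1.0):
    """Assumed contrastive-style loss: pull the feature towards its own
    prototype and push it at least `margin` away from every other one."""
    pull = squared_distance(feature, prototypes[true_name])
    push = sum(
        max(0.0, margin - squared_distance(feature, proto))
        for name, proto in prototypes.items()
        if name != true_name
    )
    return pull + push
```

The loss is zero only when the representation sits exactly on its own prototype and is at least the margin away from all others, which is the aggregating and distancing behaviour described for the RAM.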
S. Suriya, Dhivya S and Balaji M
Ashlin Deepa et al. [10] proposed A novel nearest interest point classifier for Tamil handwritten character recognition, a study of image-to-image matching through feature analysis without using machine learning approaches. The main concept of the proposed method is to make local-level decisions on the class of the test image based on individual features, called interest points (IPs), of the training images. The global decision on the class of the test image is obtained after processing these local decisions. Where the nearest interest point (NIP) threshold value was found to be greater than the probability, the matching pairs of IPs and the corresponding distances to the training images were found using these values. This process gives local decision votes for the NIPs. The classifier that classifies the IPs based on NIPs is termed the NIP classifier. The proposed NIP classifier was compared with several classifiers on a standard dataset. The computational complexity of k-means clustering is O(n*k*I*d), where n is the number of samples, k is the number of clusters, I is the number of iterations, and d is the number of features. The procedure was repeated until the classification error decreased to a given level, which increases the computational complexity. A set of groups of character classes is produced as the output of this stage. The proposed NIP classifier introduces the concept of local feature decisions, after which a global decision is taken on the class of the test character image. Without compromising recognition accuracy, the proposed method simplifies the use of variable-length and high-dimensional feature vectors, which are regarded as major issues of HCR systems. A benchmark database is used to demonstrate robust recognition performance. State-of-the-art performance is achieved because the NIP classifier calculates collective class similarity voting and thus reduces false recognitions.
In contrast to conventional classification approaches, which depend on feature reduction methods, the results presented in that paper show that the proposed classifier can operate successfully on the greater number of features available in problems with high-dimensional images.
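The local-then-global decision scheme of the NIP classifier can be sketched as nearest-neighbour voting over interest points. This is a simplification: a plain distance threshold stands in for the thresholding described in [10], and the function and parameter names are hypothetical.

```python
import math
from collections import Counter


def classify_by_local_votes(test_points, training_points, max_distance):
    """Each test interest point votes for the class of its nearest training
    interest point (local decision); the class with the most votes wins
    (global decision). Points with no neighbour within `max_distance`
    cast no vote.

    test_points: list of coordinate tuples.
    training_points: list of (coordinate tuple, class label) pairs.
    """
    votes = Counter()
    for point in test_points:
        best_label, best_distance = None, float("inf")
        for candidate, label in training_points:
            distance = math.dist(point, candidate)
            if distance < best_distance:
                best_label, best_distance = label, distance
        if best_distance <= max_distance:
            votes[best_label] += 1
    return votes.most_common(1)[0][0] if votes else None
```

Because every interest point is handled independently, images with different numbers of interest points can be compared without first forcing them into a fixed-length feature vector.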
Nibaran Das et al. [11] introduced Multiobjective optimization for recognition of isolated handwritten Indic scripts, in which efficient region sampling is used to identify the most informative local regions. The local regions are ranked according to their contribution to the recognition accuracy on the cross-validation dataset, and these rankings are used as a guiding factor for the algorithm. The contribution of a local region is determined by computing the negative of the recognition accuracy of an SVM classifier on the cross-validation dataset when the features contributed by that region are ignored and all other features are taken into consideration. The ranks of all the local regions are then determined by arranging them in descending order of their contributions, so that higher-ranked local regions are more informative than lower-ranked ones. The algorithm is initialized with an empty harmony memory, which contains the final Pareto-optimal solutions upon termination. The exploration (HMCR), exploitation (PAR), and opposition-based learning (jump rate, JR) parameters are self-adjusting: they tune themselves with every generation for a given harmony memory size. If the number of members in the current population is less than the size of the harmony memory, random local regions that are not already present in the harmony memory are added and their costs are computed. Recognition accuracy is measured using the SVM classifier. Recognition time is the average time taken to extract the features from the images. Redundancy is a measure of how many similar-looking local regions are computed for the images. The performance of the algorithm was tested on four different datasets, including a dataset of isolated handwritten basic Bangla characters, a dataset of handwritten Bangla numerals, and a dataset of handwritten English numerals. The system performs a trade-off between recognition cost, recognition accuracy, and redundancy.
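The notion of a Pareto-optimal set used above can be made concrete with a short sketch. It assumes every objective is expressed as a cost to be minimized, for example (recognition error, recognition time, redundancy); the tuple layout and function names are illustrative and do not reproduce the harmony search of [11].

```python
def dominates(a, b):
    """True if solution `a` is no worse than `b` in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(solutions):
    """Return the solutions that are not dominated by any other solution."""
    return [s for s in solutions if not any(dominates(other, s) for other in solutions)]
```

Solutions on the front represent different trade-offs: none of them can be improved in one objective without becoming worse in another.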
Rumman Rashid Chowdhury et al. [12] proposed Bangla Handwritten Character Recognition using Convolutional Neural Network with Data Augmentation. The dataset used in this experiment is the BanglaLekha-Isolated dataset. Using a convolutional neural network, the model achieves 91.81% accuracy on the alphabets (50 character classes) on the base dataset; the number of images was then expanded to 200,000 using data augmentation. The model was hosted on a web server for ease of testing and interaction. The system was implemented on Google Colab, a cloud-based web interface for running machine learning experiments that is available to AI researchers free of charge. It provides 12 gigabytes of memory, an Intel Xeon CPU running at a 2.20 GHz clock speed, and a powerful GPU (Nvidia Tesla K80), together with access to an online Python notebook that acts as the user interface. To provide a graphical interface to the trained model, the Flask library of Python was used as the backend of a web server on which the model is hosted. For the frontend, HTML, CSS, and JavaScript were used, so that the user can interact with a canvas and write Bangla letters. The server, hosted at localhost:5000 by Flask, receives an image as input and resizes it to 50x50, which the model expects; the image then passes through the trained model, which predicts the probability of each class. The class with the maximum probability is chosen as the letter that was drawn, and the result is printed on the web page. It was also observed that using a larger amount of data with variation can help the model learn the features or characteristics of the classes more effectively. The web interface also provides an easy way to interact with the model and perform real-time validation.
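The two server-side steps described above, resizing the drawn image to the 50x50 input the model expects and choosing the class with the maximum probability, can be sketched without any web framework. Nearest-neighbour resampling is an assumption (the interpolation used in [12] is not stated), and the function names and labels are placeholders.

```python
def resize_nearest(image, size=50):
    """Resize a 2D list of pixel values to size x size by nearest-neighbour
    sampling (assumed interpolation)."""
    height, width = len(image), len(image[0])
    return [
        [image[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def most_likely_class(probabilities, labels):
    """Return the label whose predicted probability is highest."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best]
```

In a deployment like the one described, the resized image would be passed to the trained model, and the returned probability vector would be mapped to a letter with `most_likely_class`.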
Adeel Yousaf et al. [13] proposed Size Invariant Handwritten Character Recognition using Single Layer Feedforward Backpropagation Neural Networks, a neural-network-based recognition system for offline handwritten Latin digits and alphabets. Each character extracted from the query image is resized dynamically to 60x40 pixels and then passed to the neural networks for recognition. Dynamic resizing gives the system size invariance and maintains the aspect ratio of the character, so that the image is not distorted during resizing. The neural networks are trained with 19,422 samples of English alphabets and 7,720 samples of digits, written by 150 different writers in various handwriting styles. The proposed algorithm takes a character image as a query and uses a single feature, pixel intensity, yet produces results comparable to those of algorithms that use multiple features for classification. The query image is a scanned image of an A4-size page containing handwritten English text. However, the algorithm can be extended to any font style or language, since it depends on pixel values rather than on particular features of a font or language. Once the text objects have been obtained and prepared for recognition using image processing techniques, the connected components (CCs) are passed to the neural network for recognition. Two different single-layer neural networks were designed using feed-forward back-propagation. The neural network used for recognizing alphabets was trained with 13,596 samples of various alphabets written in various styles by around 150 writers.
The input provided to the neural network is a vector of length 2400, the flattened form of the 60x40-pixel image of one character. High recognition rates were achieved even without feature extraction. Dynamic resizing was employed to maintain the quality of the connected components extracted from a query image while resizing them to the standard size. The proposed system successfully segments the handwritten text characters from a query image and achieves a precision of 95.69%.
Sampath et al. [14] proposed Handwritten optical character recognition by hybrid neural network training algorithm, a hybrid neural network training algorithm for English handwritten OCR. First, the noise in the input image is removed using a median filter, and the image is resized. Then the feature sets, namely positional and structural descriptors, are extracted from the input image. Once the features are extracted, the proposed FLM-based neural network recognizes the handwritten character. The FLM algorithm is obtained by combining the Firefly and Levenberg-Marquardt (LM) algorithms for training the neural network, and it is incorporated into a feed-forward neural network. The system classifies and recognizes 62 characters: the 26 uppercase English letters, the 26 lowercase English letters, and the digits zero to nine. Pixel values are first obtained from the resized characters; the feature sets are then extracted from the image using the proposed H-descriptors and G-descriptors and passed to the proposed network training algorithm to recognize the character.
In that paper, the FLM-based neural network training algorithm is formulated by combining the Firefly and Levenberg-Marquardt algorithms for the training process of the neural network. The proposed hybrid approach obtained 90% accuracy.
Dhurgham Ali Mohammed et al. [15] proposed Off-line handwritten character recognition using an integrated DBSCAN-ANN scheme, a novel method for handwritten Arabic characters that combines the density-based clustering technique with statistical and morphological features. The first stage in recognizing a handwritten character image is to binarize the image and then apply noise removal techniques. The density-based algorithm is used to sort and find clusters of any shape based on pixel positions; this separates the image into characters. Each character is divided into four regions from the centroid, followed by feature extraction. The features include the vertical and horizontal projections, the upper and lower profiles, rectangularity, and orientation. The results of this procedure are passed to the neural network (NN) stage, which produces a high level of correctness and accuracy through training. The test results were compared with two state-of-the-art studies. The proposed Arabic word recognition system is geared towards state-of-the-art offline text recognition techniques. The IFN/ENIT dataset of handwritten character images is used to cover specific shapes of Arabic characters. It consists of more than 2900 different characters in bitmap image format. The process starts with binarization of the word image, followed by segmentation of the selected word into letter segments. DBSCAN can sort and find clusters of any shape based on pixel positions that lie close to one another in an Arabic character.
An integrated DBSCAN-ANN scheme was developed based on character feature extraction and character recognition. The number of input neurons is determined by the feature vector length. The input characters were represented with 168 elements, with 28 neurons in the output layer. The characters are recognized using a two-layer network with the log-sigmoid transfer function, which was considered optimal for learning. The function produces outputs in the range 0 to 1. The data were randomly divided into two parts: 80% for training and 20% for testing the system. Backpropagation training based on the principle of gradient descent is used.
Vijaya Kumar et al. [16] used Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments. The method is straightforward to implement, computationally efficient, has small memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited to problems that are large in terms of data and/or parameters. The input layer of the feed-forward neural network (FFNN) takes an n-dimensional vector and is followed by L-1 hidden layers; usually two hidden layers are used, and more may be added as required. Finally, there is one output layer with k output classes. Each neuron in the hidden and output layers can be split into two parts: pre-activation and activation. Most FFNNs have two hidden layers with 16, 32, or more neurons, and the hidden layers multiply the image pixel data, which lies between 0 and 1, by randomly initialized weights.
In contrast, the Deep Feed-Forward Neural Network (DFFNN) was designed with the same two hidden layers, but each hidden layer consists of a large number of neurons (512 in that work), which are multiplied by random weights. These methods were trained and tested on a user-defined dataset collected from different users. The experimental results show that DFFNN, CNN-Adam, and CNN-RMSprop yield the best accuracy for handwritten Hindi characters compared to the alternative methods.
Baki Koyuncu et al. [17], in "Handwritten Character Recognition by using Convolutional Deep Neural Network", reviewed the recognition of handwritten characters. Many researchers have developed systems for handwritten character recognition, and several important systems are mentioned in that work. Character recognition systems have been designed for a variety of purposes. The system developed by some researchers can be built in hardware with very-large-scale integration (VLSI) circuitry, and its character recognition is resistant to dynamic movement. Other researchers combined Hamming error-correcting codes from communication theory with a neural network in their design. Another technique was developed to recognize handwritten characters in various languages with a neural network. These designs produced accurate results but also made errors when the handwritten characters were in an unusual format. One researcher even proposed a technique to relate writers to their handwriting. These studies mostly used multi-layer feed-forward neural networks. In that study, the Modified National Institute of Standards and Technology (MNIST) database, distributed by the U.S. Department of Commerce, is used.
This database contains a large number of images of handwritten characters. Reducing the size of the images reduces the overall time taken to train the neural network. A convolutional neural network is examined for the reader. This network is well suited to recognizing handwritten characters. The work relies on the classification of characters at the input of the CNN. Compared with other deep learning architectures, CNNs perform well on both images and big data. The aim of using deep learning was to take advantage of the power of CNNs, which can manage high-dimensional data and share their weights.
Soman et al. [18], in "On creating handwritten character image database for Malayalam language script", aimed to build a handwritten character image database for Malayalam script. The unique orthographic representations of the Malayalam characters form the distinct character classes, and the current version of the database contains 85 character classes frequently used in writing Malayalam script. Scattering convolutional network based features achieved 91.05% recognition accuracy among the methods compared.
Liang Xu et al. [19] presented "Recognition of Handwritten Chinese Characters Based on Concept Learning". Concept learning is a human-like learning approach. Unlike existing deep learning models, a conceptual model can be learned from as few as one sample. This paper is the first to propose a handwritten Chinese character recognition method based on concept learning.
Unlike existing image representation-based character recognition methods, the proposed method builds a meta-stroke library with prior knowledge and then derives a Chinese character conceptual model based on stroke relationship learning, using a character stroke extraction method and Bayesian program learning. During character recognition, Markov chain Monte Carlo sampling is used to obtain the character generation model for each character concept. This generation model computes the probability that the target and training characters belong to the same class, and thereby determines the class of the target character. The strokes that make up a particular character are extracted using the character stroke extraction method mentioned above. The experimental results show that the proposed method can train a suitable model for character classification using as few as one character sample.
Ahmed Talat Sahlol et al. [20] presented "Handwritten Arabic Optical Character Recognition Approach Based on Hybrid Whale Optimization Algorithm With Neighborhood Rough Set", a hybrid machine learning approach that uses neighborhood rough sets with a binary whale optimization algorithm to select the most appropriate features for recognizing handwritten Arabic characters. To validate the approach, the authors used the CENPARMI dataset, a well-known dataset for machine learning experiments involving handwritten Arabic characters.
The results show a clear advantage of the proposed approach in recognition accuracy, memory footprint, and processor time over variants without its feature selection. Compared with other recent state-of-the-art optimization algorithms, the proposed approach outperformed all others in all experiments. It also shows the highest recognition rate with the smallest time consumption compared to deep neural networks such as VGGnet, Resnet, Nasnet, Mobilenet, Inception, and Xception. It was further compared with recently published works using the same dataset, which confirmed its classification accuracy and time consumption. The approach consists of four stages. The first is preprocessing of the dataset, which aims to remove noise and clean the data. The second is feature extraction, which extracts features such as gradient features, vertical and horizontal projection features, vertical/horizontal/diagonal projection features, and others. The third stage, feature selection, is the main contribution of that paper. In this stage, the feature selection strategy starts by generating a random population that represents a set of solutions. Each solution is then converted into a binary form, where the features corresponding to 1's are considered relevant and the remaining features are ignored. The quality of the selected features (based on the current solution) is then evaluated by computing the objective function. The main purpose of that paper is to build a hybrid approach that selects enough features to improve the recognition of handwritten Arabic characters in the smallest amount of time with a low memory footprint.
The results show that the BWOA-NRS approach can select the most appropriate features, which clearly improves classification performance. Compared with recent feature selection algorithms such as ABC, SCA, GWA, ALO, and SSA, the BWOA-NRS algorithm outperforms the other swarm-based approaches.

Existing System
Handwritten character recognition is a difficult problem because of the great variation in writing styles and the different sizes and orientation angles of the characters. Among the branches of handwritten character recognition, English letters and numerals are easier to recognize than Tamil characters. Early techniques in handwritten character recognition failed notably in the presence of blur, low contrast, low resolution, high image noise, and other distortions. To cope with these distortions, a Convolutional Neural Network is used to recognize and classify the image.

Figure 1. Comparison of Existing Algorithms
The purpose of this project is to take handwritten Tamil characters as input, process each character, train the neural network to recognize the pattern from the word, and convert the character to a beautified version of the input. The project aims to develop a model that recognizes Tamil characters within a word and classifies the character image even if the image is blurred or otherwise distorted.

Proposed System
Character recognition is the recognition of printed or written text characters by a computer. It involves photo scanning of the text character by character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.

Figure 2. Character Sample at Various Levels

Figure 3. Training and Validation of Proposed Model
The proposed model consists of two parts: a Training Part and a Recognition Part.
Training Part - The training part involves data pre-processing, building the network architecture, and training the network with the preprocessed data.
Recognition Part - The recognition part involves recognizing the character using the trained model.
The aim of pre-processing is an improvement of the image data that suppresses unwanted distortions or enhances image features important for further processing. Several categories of image pre-processing methods exist. Image pre-processing methods exploit the considerable redundancy in images: neighboring pixels corresponding to one object in real images have essentially the same or similar brightness values.
Thus, a distorted pixel can often be restored as the average value of its neighboring pixels. If the nature of the noise (usually its spectral characteristics) or information about the objects searched for in the image is known, pre-processing can be simplified considerably. Pixel brightness transformations modify pixel brightness, and the transformation depends on the properties of the pixel itself. There are two kinds: brightness corrections and grayscale transformations. Brightness correction considers the original brightness and the position of the pixel in the image, while grayscale transformations change the brightness regardless of position in the image. Position-dependent brightness correction addresses the fact that the sensitivity of image acquisition and digitization devices should not, ideally, depend on position in the image. For brightness interpolation, assume that a planar transformation has been accomplished and new point coordinates (x,y) have been obtained. The position of the point does not, in general, fit the discrete raster of the output image, whereas values on the integer grid are required.
Every pixel in the output image raster can be obtained by brightness interpolation of some neighboring non-integer samples. The brightness interpolation problem is usually expressed in a dual way: by determining the brightness of the original point in the input image that corresponds to the point in the output image lying on the discrete raster, that is, by computing the brightness value of the pixel (x,y) in the output image, where x and y lie on the discrete raster.
The dataset contains 82,929 images in TIFF or PNG format. These images are obtained from the online version using simple piecewise linear interpolation and a constant thickening factor. The images are bi-level, with a white background (255) and a black foreground (0).
The images are of varying sizes, so they were size-normalized to 64 × 64 using bilinear interpolation and scaled to the [0, 1] range. We performed training on two sets of inputs, one with the original images and another with inverted images (foreground as 1 and background as 0). Preprocessing is the first step: the raw data are cleaned, and the cleaned images are then normalized because they are of varying sizes.

Figure 4. Preprocessing of the model
Normalization is a process that changes the range of pixel intensity values. Applications include photographs with poor contrast due to glare, for example. Normalization is sometimes called contrast stretching or histogram stretching. In more general fields of data processing, such as digital signal processing, it is referred to as dynamic range expansion. The purpose of dynamic range expansion in the various applications is usually to bring the image, or other type of signal, into a range that is more familiar or normal to the senses, hence the term normalization.
Assume a dataset X with N rows (entries) and D columns (features), where X[:,i] represents feature i and X[j,:] represents entry j.
Standardization computes (X[:,i] - mean(X[:,i])) / std(X[:,i]) for each feature i. This transformation sets the mean of the data to 0 and the standard deviation to 1. In most cases, standardization is applied feature-wise.
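The two feature-wise rescalings discussed in this part of the section, standardization to zero mean and unit standard deviation and min-max rescaling to [0,1], can be sketched as below. The function names and the small `eps` guard against division by zero are assumptions added for illustration.

```python
import numpy as np

def standardize(X, eps=1e-12):
    """Feature-wise standardization: each column gets mean 0 and std 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / (std + eps)  # eps avoids division by zero for constant columns

def min_max_scale(X, eps=1e-12):
    """Feature-wise min-max scaling: each column is mapped to [0, 1]."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    return (X - lo) / (hi - lo + eps)
```

Here `X` follows the convention above: rows are entries and columns are features.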
Min-max scaling computes (X[:,i] - min(X[:,i])) / (max(X[:,i]) - min(X[:,i])) for each feature i. This method rescales the range of the data to [0,1]. In most cases, it is applied feature-wise as well. For images, its usual purpose is to convert an input image into a range of pixel values that is more familiar or normal to the senses, hence the term normalization.
Figure 6. After Normalization
Data augmentation is a technique that lets practitioners significantly increase the diversity of data available for training models without actually collecting new data. Data augmentation techniques such as cropping, padding, and horizontal flipping are commonly used to train large neural networks. Deep learning sometimes runs into the problem that the available data is limited in size. To generalize better, the model needs more data with as much variation as possible. When the dataset is not large enough to capture enough variation, more data is generated from the given dataset, and this is where data augmentation plays an important role. Data augmentation means increasing the amount of training data using only information available in the training data. It is a collection of techniques for "enhancing" training data, often used to teach models to ignore irrelevant variations in the data. For example, when training an image classifier, one can feed the model both the original and flipped or rotated versions of each training image (not such a good idea when training an OCR model). Also simple yet clever is "random erasing", in which a small rectangle of a training image is randomly blacked out to make the model robust to occlusions. In another example, a generative adversarial network was used to transform synthetic training images to make them look like real photographs (the training task was to determine the gaze direction of a human subject).
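Two of the augmentation techniques mentioned above, padding followed by a random crop and random erasing, can be sketched as follows. This is an illustrative sketch only: the function names, the pad width, the erase size, and the white-background default are assumptions, and flipping is deliberately left out because, as noted above, it is not a good idea for OCR.

```python
import numpy as np

def pad_and_random_crop(img, pad, rng, background=1.0):
    """Pad with background pixels, then crop back to the original size at a random offset."""
    h, w = img.shape
    padded = np.pad(img, pad, mode="constant", constant_values=background)
    top = rng.integers(0, 2 * pad + 1)   # upper bound is exclusive
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]

def random_erase(img, size, rng, fill=0.0):
    """Black out one randomly placed size x size square (random erasing)."""
    h, w = img.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    out = img.copy()  # leave the original training image untouched
    out[top:top + size, left:left + size] = fill
    return out

def augment(img, rng, pad=4, erase=8):
    """Apply a random shift followed by random erasing to one image."""
    return random_erase(pad_and_random_crop(img, pad, rng), erase, rng)
```

A typical call would be `augment(img, np.random.default_rng(0))` on a 64 × 64 image already scaled to [0, 1].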
Data augmentation is sometimes used to make the model more robust to over-fitting. Often, images are simply taken from the training set and transformed (rotation, flip, color variation, noise, ...).
While in primitive methods filters are hand-engineered, with enough training, ConvNets can learn these filters/characteristics. The architecture of a ConvNet is analogous to the connectivity pattern of neurons in the human brain and was inspired by the organization of the visual cortex. Individual neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. A collection of such fields overlaps to cover the entire visual area. Each neuron receives some inputs, performs a dot product, and optionally follows it with a non-linearity. The whole network still expresses a single differentiable score function, from the raw image pixels at one end to class scores at the other, and it still has a loss function (for example SVM/Softmax) on the last (fully connected) layer.
Input Layer - The input layer of a neural network is composed of artificial input neurons and brings the initial data into the system for further processing by subsequent layers of artificial neurons. The input layer is the very beginning of the workflow of the artificial neural network.
Convolution Layer - The convolutional layer is the core building block of a CNN. The layer's parameters consist of a set of learnable filters (or kernels), which have a small receptive field but extend through the full depth of the input volume.
Pooling Layer - A pooling layer is another building block of a CNN. Its function is to progressively reduce the spatial size of the representation in order to reduce the number of parameters and the amount of computation in the network. The pooling layer operates on each feature map independently.
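The convolution and pooling layers described above can be illustrated with a small single-channel NumPy sketch. The function names are hypothetical, and the loop-based convolution is written for clarity rather than speed; as in most deep learning libraries, the "convolution" here is technically a cross-correlation (the kernel is not flipped).

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Slide a kernel over a 2-D input and take a dot product at each position."""
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Dot product between the kernel and the receptive field at (i, j).
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Element-wise non-linearity applied after the dot product."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Non-overlapping max pooling that shrinks each spatial dimension by `size`."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]  # drop rows/columns that do not fill a window
    return x.reshape(h, size, w, size).max(axis=(1, 3))
```

For a 6 × 6 input and a 3 × 3 kernel, `conv2d_valid` yields a 4 × 4 feature map, and `max_pool2d` reduces it to 2 × 2.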
Pooling layers are used to reduce the dimensions of the feature maps, which reduces the number of parameters to learn and the amount of computation performed in the network. The pooling layer summarizes the features present in a region of the feature map generated by a convolution layer, so further operations are performed on summarized features instead of the precisely positioned features generated by the convolution layer. This makes the model more robust to variations in the position of the features in the input image. In global pooling, an nh x nw x nc feature map is reduced to a 1 x 1 x nc feature map, which is equivalent to using a filter of dimensions nh x nw, that is, the dimensions of the feature map. It can be either global max pooling or global average pooling.
A fully connected layer is the component that does the discriminative learning in a deep neural network. It is a simple multi-layer perceptron with learnable weights.
Various activation functions have been used across different convolutional neural network architectures. Nonlinear activation functions such as ReLU, LReLU, PReLU, and Swish have shown better results than the classic sigmoid or tanh functions, and they have helped speed up training. In this work, we tried different activation functions and found ReLU to be more effective than the others.
A Convolutional Neural Network (CNN, or ConvNet) is a special kind of multi-layer neural network designed to recognize visual patterns directly from pixel images with minimal preprocessing. The ImageNet project is a large visual database designed for use in visual object recognition software research.
The ImageNet project runs an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), in which software programs compete to correctly classify and detect objects and scenes.
Figure 9. CNN Models used in Proposed Work
The VGG network is characterized by its simplicity, using only 3×3 convolutional layers stacked on top of each other in increasing depth. Reducing the volume size is handled by max pooling. Two fully connected layers, each with 4,096 nodes, are then followed by a softmax classifier. The "16" and "19" stand for the number of weight layers in the network. The VGG network architecture was introduced by Simonyan and Zisserman, who found training VGG16 and VGG19 challenging (specifically regarding convergence of the deeper networks). To make training easier, they first trained smaller versions of VGG with fewer weight layers. The smaller networks converged and were then used as initializations for the larger, deeper networks; this process is called pre-training. While it makes sense, pre-training is a time-consuming, tedious task that requires an entire network to be trained before it can serve as an initialization for a deeper network.
ILSVRC is an annual computer vision competition in which teams compete on two tasks. The first is to detect objects within an image drawn from 200 classes, which is called object localization. The second is to classify images, each labeled with one of 1,000 categories, which is called image classification.
The input to the network is an image of dimensions (224, 224, 3). The first two layers have 64 filters of size 3*3 with same padding. After a max pooling layer of stride (2, 2), there are two convolution layers with 128 filters of size (3, 3).
This is followed by a max pooling layer of stride (2, 2), the same as in the previous block. Then there are three convolution layers with 256 filters of size (3, 3). After that there are two sets of three convolution layers and a max pooling layer, each with 512 filters of size (3, 3) and same padding. The image is then passed to a stack of two convolution layers. In these convolution and max pooling layers, the filters are of size 3*3, rather than 11*11 as in AlexNet and 7*7 as in ZF-Net. In some of the layers, the network also uses 1*1 filters, which control the number of input channels.
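As a sanity check on the layer description above, the following sketch walks through the standard VGG16 configuration and tracks the feature-map shape and the number of trainable parameters. The configuration list and the function name `vgg16_summary` are written here for illustration; the defaults correspond to the ImageNet setting (224 × 224 × 3 input, 1,000 classes), not to the Tamil dataset, whose exact configuration the text does not give.

```python
# Standard VGG16 layout: integers are the output channels of a 3x3 "same"
# convolution, "M" is a 2x2 max pooling layer with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

def vgg16_summary(input_shape=(224, 224, 3), fc_units=(4096, 4096), num_classes=1000):
    """Return the final feature-map shape and the total parameter count."""
    h, w, c = input_shape
    params = 0
    for layer in VGG16_CFG:
        if layer == "M":
            h, w = h // 2, w // 2                 # pooling halves height and width
        else:
            params += 3 * 3 * c * layer + layer   # kernel weights plus biases
            c = layer                             # same padding keeps h and w
    features = h * w * c                          # flattened input to the first dense layer
    for units in (*fc_units, num_classes):
        params += features * units + units
        features = units
    return (h, w, c), params

shape, total = vgg16_summary()
print(shape, total)  # (7, 7, 512) 138357544
```

Passing a different `input_shape` or `num_classes` shows how the parameter count changes when the same architecture is adapted to another dataset.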

Experimental Results
This is the logical representation of ResNet. At ILSVRC 2015, the Residual Neural Network (ResNet) by Kaiming He et al. introduced a novel architecture with "skip connections" and heavy use of batch normalization. Such skip connections are similar to the gated units or gated recurrent units that have recently been applied successfully in RNNs. Thanks to this technique, they were able to train a network with 152 layers while still having lower complexity than VGGNet. It achieves a top-5 error rate of 3.57%, which beats human-level performance on this dataset.
ResNet was originally designed as a method to solve the vanishing gradient problem, in which backpropagated gradients become extremely small as they are multiplied over and over, limiting the depth of a neural network. The ResNet architecture attempts to solve this by using skip connections, that is, shortcuts that allow data to skip past layers. The model consists of a series of convolutional layers with skip connections, then average pooling, then an output fully connected (dense) layer. For transfer learning, only the convolutional layers are needed, since they contain the features of interest, so the final dense layer would be omitted when importing the model.
Deep learning practitioners add many layers in order to extract important features from complex images. The first layers may detect edges, and the subsequent layers towards the end may detect recognizable shapes, such as the tires on a car. However, if more than 30 layers are added to a plain network, its performance suffers and it achieves a low accuracy. This is contrary to the expectation that adding layers will improve a neural network.
This is not due to overfitting, because in that case dropout and regularization techniques could be used to solve the problem altogether. It is mainly due to the well-known vanishing gradient problem. The ResNet152 model with 152 layers won the ILSVRC ImageNet 2015 challenge while having fewer parameters than the VGG19 network, which was very popular at that time. A residual network consists of residual units or blocks that have skip connections, also called identity connections.

Figure 11. Layers of RESNet
A residual block has a 3 x 3 convolution layer followed by a batch normalization layer and a ReLU activation function. This is followed by another 3 x 3 convolution layer and a batch normalization layer. The skip connection skips both of these layers and adds its input directly before the final ReLU activation function. Such residual blocks are repeated to form a residual network.
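The residual block described above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions: the normalization step uses per-channel statistics of a single feature map and has no learned scale or shift, so it only stands in for real batch normalization, and the function names and weight shapes are hypothetical.

```python
import numpy as np

def conv3x3_same(x, w):
    """3x3 convolution with zero padding. x: (H, W, C_in), w: (3, 3, C_in, C_out)."""
    h, wd, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, wd, w.shape[3]))
    for di in range(3):
        for dj in range(3):
            # Shifted view of the input multiplied by the matching kernel tap.
            out += xp[di:di + h, dj:dj + wd, :] @ w[di, dj]
    return out

def normalize(x, eps=1e-5):
    """Simplified stand-in for batch normalization (no learned scale or shift)."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """conv -> norm -> ReLU -> conv -> norm, then add the input before the final ReLU."""
    f = relu(normalize(conv3x3_same(x, w1)))
    f = normalize(conv3x3_same(f, w2))
    return relu(f + x)  # skip connection: F(x) + x
```

With all-zero weights the residual branch F(x) vanishes and the block reduces to `relu(x)`, which illustrates why the skip connection makes it easy for a block to fall back to (nearly) passing its input through.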

Figure 12. Block of RESNet
The output of the previous layer is added to the output of the layer after it in the residual block. The hop or skip could span 1, 2, or even 3 layers. When adding, the dimensions of x may differ from those of F(x) because the convolution operation reduces its dimensions. In that case, an additional 1 x 1 convolution layer is applied to change the dimensions of x.
Another regularization technique called Scheduled Drop Path is also proposed, which significantly improves the generalization of the models. Finally, this model achieves state-of-the-art results with a smaller model size and lower complexity (FLOPs). In this network, although the overall architecture is predefined as shown above, the blocks or cells are not predefined by the authors. Instead, they are found by a reinforcement learning search method; the number of motif repetitions N and the number of initial convolutional filters are free parameters used for scaling. Specifically, these cells are called the Normal Cell and the Reduction Cell. Normal Cell: convolutional cells that return a feature map of the same dimensions. Reduction Cell: convolutional cells that return a feature map whose height and width are reduced by a factor of two. Only the structures of (or within) the Normal and Reduction Cells are searched by the controller RNN (Recurrent Neural Network).
The task of object localization is to predict the object in an image as well as its boundaries. The difference between object localization and object detection is subtle: object localization aims to locate the main (or most visible) object in an image, while object detection tries to find all the objects and their boundaries. Object detection can be performed using a technique called "sliding window detection". We train a ConvNet to detect objects within an image and slide windows of various sizes over it.
For each window, we perform a prediction. The major drawback is the computational cost, which is very high because there can be a large number of windows. The solution is to compute the sliding window detection convolutionally. A selective search algorithm is used for object recognition.
Workflow
Split an image into S×S cells. If an object's center falls into a cell, that cell is "responsible" for detecting the existence of that object. Each cell predicts (a) the location of B bounding boxes, (b) a confidence score, and (c) a probability of the object class conditioned on the existence of an object in the bounding box.
The coordinates of a bounding box are defined by a tuple of 4 values, (center x-coord, center y-coord, width, height), written (x,y,w,h), where x and y are offsets relative to the cell location. Moreover, x, y, w, and h are normalized by the image width and height, and thus all lie in (0, 1].
A confidence score indicates the likelihood that the cell contains an object: Pr(containing an object) x IoU(pred, truth), where Pr = probability and IoU = intersection over union.
If the cell contains an object, it predicts the probability of this object belonging to each class C_i, i=1,…,K: Pr(the object belongs to the class C_i | containing an object). At this stage, the model predicts only one set of class probabilities per cell, regardless of the number of bounding boxes, B.
In total, one image contains S×S×B bounding boxes, each box corresponding to 4 location predictions and 1 confidence score, and each cell additionally predicts K conditional probabilities for object classification. The total number of prediction values for one image is S×S×(5B+K), which is the tensor shape of the final conv layer of the model.
The final layer of the pre-trained CNN is modified to output a prediction tensor of size S×S×(5B+K).
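The prediction tensor size and the IoU term of the confidence score described in the workflow above can be sketched as follows. The function names are hypothetical, boxes are assumed to be given as (center x, center y, width, height) in a common coordinate frame, and the example values S=7, B=2, K=20 are illustrative rather than taken from this work.

```python
def prediction_shape(S, B, K):
    """Each of the S x S cells predicts B boxes (4 coords + 1 confidence) and K class scores."""
    return (S, S, 5 * B + K)

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (center_x, center_y, width, height)."""
    ax0, ax1 = box_a[0] - box_a[2] / 2, box_a[0] + box_a[2] / 2
    ay0, ay1 = box_a[1] - box_a[3] / 2, box_a[1] + box_a[3] / 2
    bx0, bx1 = box_b[0] - box_b[2] / 2, box_b[0] + box_b[2] / 2
    by0, by1 = box_b[1] - box_b[3] / 2, box_b[1] + box_b[3] / 2
    # Overlap extent along each axis, clipped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

print(prediction_shape(7, 2, 20))  # (7, 7, 30)
```

Two unit boxes whose centers are half a box width apart overlap in half their area, so their IoU is 0.5 / 1.5 = 1/3.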

Figure 14. Sample Localization of Datasets
The loss consists of two parts: the localization loss for bounding box offset prediction and the classification loss for conditional class probabilities. Both parts are computed as sums of squared errors. Two scale parameters control how much to increase the loss from bounding box coordinate predictions (λcoord) and how much to decrease the loss from confidence score predictions for boxes without objects (λnoobj). Down-weighting the loss contributed by background boxes is important because most bounding boxes contain no object instance. In the original paper describing this approach, the model sets λcoord=5 and λnoobj=0.5.
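A simplified version of the sum-of-squared-errors loss described above might look like the sketch below. It is not the exact published formulation: the array layout, the function name, and the `obj_mask` argument (1 where a box is responsible for an object, 0 otherwise) are assumptions, and the square root applied to box width and height in the original YOLO loss is omitted for brevity.

```python
import numpy as np

def detection_loss(pred_box, true_box, pred_conf, true_conf,
                   pred_cls, true_cls, obj_mask,
                   lambda_coord=5.0, lambda_noobj=0.5):
    """Sum-of-squares detection loss.

    pred_box, true_box:   (S, S, B, 4) box coordinates (x, y, w, h)
    pred_conf, true_conf: (S, S, B)    confidence scores
    pred_cls, true_cls:   (S, S, K)    class probabilities per cell
    obj_mask:             (S, S, B)    1 if the box is responsible for an object
    """
    noobj_mask = 1.0 - obj_mask
    cell_has_obj = obj_mask.max(axis=-1)  # (S, S): does the cell contain an object?
    # Localization error, counted only for boxes responsible for an object.
    loc = np.sum(obj_mask[..., None] * (pred_box - true_box) ** 2)
    conf_err = (pred_conf - true_conf) ** 2
    conf_obj = np.sum(obj_mask * conf_err)
    conf_noobj = np.sum(noobj_mask * conf_err)  # background boxes, down-weighted below
    # Classification error, counted only for cells that contain an object.
    cls = np.sum(cell_has_obj[..., None] * (pred_cls - true_cls) ** 2)
    return float(lambda_coord * loc + conf_obj + lambda_noobj * conf_noobj + cls)
```

With the default weights, a coordinate error contributes five times as much as the same squared error in a confidence score, while confidence errors on empty boxes contribute only half.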
The dataset used for evaluation is the Tamil character dataset from HP Labs India. It consists of approximately 85,000 images of Tamil characters, with approximately 300 images per letter, and the images are of different shapes. The models analyzed are evaluated using cross entropy. Cross entropy measures the performance of a classification model by comparing the predicted class probabilities with the true labels. No other specific evaluation measure is used, and the effectiveness of the algorithm can be judged by how well it performs the task it is designed for.
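The cross-entropy measure mentioned above can be sketched for a softmax classifier as follows. The function names are hypothetical, and integer class labels are assumed; the small constant inside the logarithm is only a numerical safeguard.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the true class of each sample."""
    probs = softmax(logits)
    n = logits.shape[0]
    return float(-np.mean(np.log(probs[np.arange(n), labels] + 1e-12)))
```

A model that assigns equal scores to K classes has a cross entropy of log(K), and the value approaches 0 as the probability of the correct class approaches 1.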
The results show that a convolutional neural network is capable of achieving record-breaking results on the Tamil dataset. The model detects the features of the labels and separates them.
The image is recognized in a variety of challenging conditions, and performance and accuracy are validated using cross entropy.

Conclusion
A large body of research on handwritten character recognition exists in the literature. However, there is no standard solution that identifies all Tamil characters with reasonable accuracy. Various methods have been used in each phase of the recognition process, yet challenges remain in recognizing normal as well as abnormal writing, slanted characters, similarly shaped characters, joined characters, curves, and so on. In this project, I have presented various aspects of each phase of the offline Tamil character recognition process. Nearly the full character set was used, but different writing styles and font sizes are not covered; these challenges can be explored further in the future. So far I have trained four CNN models on the dataset, and characters are recognized from words. Among the models evaluated, different CNN models produce different results, and the one that yields the highest recognition accuracy is preferred. The handwritten recognition system described in this project has potential applications in recognizing handwritten characters from words. The proposed architecture has shown enhanced performance in recognizing characters.