Consumption of Licit and Illicit Substances leading to Mental Illness: A Prevalence Study

Background: Drug and narcotics abuse has become a prime concern for society. Technological intervention is therefore needed to examine the prevalence, severity and consequences of the drug menace. Objective: This study aims to support clinical decisions with behaviour observation data through preliminary screening of the prevalence, correlation and severity of illness. Method: A model is proposed to screen for Generalized Anxiety Disorder and depression in subjects who abuse drugs, marijuana or alcohol. A data set of Sikkim’s youth is used to examine the relation between addiction and mental disorder. Result: The proposed system associates substance abuse with these illnesses with an accuracy of 0.83, achieved by the Support Vector Machine, the best score among the machine learning models compared. The model has been deployed and is under observation in a few rehabilitation centres.


Introduction
Mental health is an important aspect of healthy living at every stage of the human life cycle, from childhood through adolescence to adulthood and finally old age [1]. However, biological factors (genetic, biochemical, etc.), life experiences (trauma, abuse, etc.), social factors (culture, family, society, etc.) and psychological factors (mood, personality, behavior, etc.) can alter the behaviour of an individual, which in turn causes ill health. The 2018 National Survey on Drug Use and Health from the US Department of Health states that young people who consume licit or illicit substances are often patients of depression and other mental illnesses [2]. It further states that a person consuming illicit and licit substances also suffers changes in physical and psychological behaviour. To establish the motivation for and prevalence of consumption, various studies have examined the psychological and behavioral patterns of people who are mentally unwell, with a view to their treatment and further prevention. For the association of psychological constructs with illicit and licit substance consumption, [3] studied sensitivity and [4] studied impulsivity. Similarly, [5] tested the behavioral inhibition system and sensation seeking, and [6] reported temporal discounting in the consumption of such substances. [7] used questionnaires to build Machine Learning (ML) models and [8] used social mining to understand the pattern of substance intake. The number of deaths and accidents that take place annually due to substance abuse and alcoholism is high and increasing every year. According to the Indian national survey [9] by the National Drug Dependence Treatment Centre (NDDTC), about 14.6% of the population uses alcohol, 2.8% uses cannabis, 0.96% uses pharmaceutical opioids and 0.52% uses opium.
Given the alarming rate at which the figures above are increasing, identifying people who consume such life-threatening substances has been an ongoing challenge for the government as well as for research groups. In addition to clinical and laboratory practices, there are various psychological and empirical means to understand the consumption pattern and its associated morbidity. However, the varied range of disorders, the heterogeneity of their causes, the co-morbidity of diseases, the social stigmatization of such disorders and the severity of their effects sometimes hinder diagnosis, planning, treatment and prognosis.
To provide doctors, counselors, parents and teachers with a patient analyser and referral system, this proposed study brings such cases forward to accumulate data and conceptualize an ecosystem that maps behavioral changes, clinical studies, self-reported information, etc. to mental disorders and psychometric parameters in the particular case of licit or illicit substance abuse in the state of Sikkim. Section 2 presents related work and the limitations of previous studies. Section 3 introduces the research methodology. Section 4 elaborates on the application of the proposed approach and discusses the results. Section 5 highlights challenges and limitations. Finally, Sections 6 and 7 present some recommendations and the conclusion with future work, respectively.

Related Work
This review covers statistical analyses based on clinical data from hospitals, socio-economic surveys and society-based surveys. Some of the existing work studied is aligned with technical interventions, social mining, behavioral study [10] and the modeling of ML architectures, used in combination to study the treatment, prevalence and relapse of diseases.
Alcohol consumption has traditionally been prevalent among Sikkim's population [11]. The State Socio Economic survey has shown its prevalence to be 37% in urban areas and 56% in rural areas. There is a lot of enthusiasm surrounding the use of historical data, ML techniques and other cognitive algorithms to unearth key patterns and interactions in mental health data [12] in association with substance consumption. To cite an example, schizophrenia is a severe and complex psychiatric disorder that develops in approximately 1% of the world's population [13] and is associated with the various forms of intoxication to which an individual is subjected.
Web-enabled services such as social networks, blogs and messages, personality traits, simple questionnaires based on standards such as the Diagnostic and Statistical Manual of Mental Disorders (DSM) [14], addiction severity gauges [15], and face-to-face interviews and beyond [16] with patients or healthy people are used to gain insights and correlate them with illness. Traditional statistical analysis has been used in the healthcare sector for decades to predict certain outcomes [17] in relation to substance intake and illness. The use of data analytics to converge data from various sources into notable dimensions and to support decision making is suggested in [18,19,20], along with regression analysis.
The application of Artificial Intelligence algorithms in image analysis [21][22][23][24] has transformed traditional, unsustainable health care systems into sustainable ones, with model-based diagnosis and precision-based treatment for various illnesses. The research works [25][26][27][28] deal with techniques and models for mining, clustering and analysing medical records. On the behavioral spectrum, however, a study [29] conducted within the hospitals of Sikkim on patients seeking emergency services revealed that 1.16% of the study population were substance abusers; 77.8% were alcohol abusers and the remainder opioid abusers. A study within the northeastern part of India, based on socio-economic parameters [30], illustrated that male abusers outnumbered female abusers. Methodologies using ML models [31] have been used to partition the data and analyse the severity and prevalence of illicit or licit substance abuse [32,33]. The studies discussed above are also supported by the work in [34,35], which examines in depth the correlation of illness with the intake of illicit drugs with respect to Sikkim and the northeastern states. Similarly, the prevalence of illicit substance use among youth in the city of Hong Kong is reviewed by [36,37]. Findings and perceptions [38] on drinking alcohol, age of prevalence, groups of people, region and geography of penetration, type of drug and other factors have also been recorded.
A study [39] of hospitalized patients focused on cases of mental illness due to drug abuse and substance intake. It revealed that the psychological correlates of substance abuse cannot be ruled out. The investigation was performed as per DSM fifth edition diagnostics using the Beck Depression Inventory (BDI), the Eysenck Personality Questionnaire-Revised (EPQ-R) and the International Classification of Diseases (ICD). The entire study was based on descriptive statistical analysis with standard statistical software packages such as SPSS and Predictive Analytics Software. For early detection and classification of neural disorders [40], Electroencephalogram (EEG) signals were also classified into dementia, autism and epilepsy using various algorithms. Data sets were also gathered on social, economic, demographic, cognitive and psychotic parameters to evaluate the likelihood of prevalence of alcohol and substance abuse. One research work [41] collected data from the physical movement of the body to obtain temporal information, and analysed comparative fit index parameters along with some biological parameters. Online therapy for treatment and its management [42] used an online platform to assist clinicians providing cognitive therapy to patients with such illness. Research has progressed further to ML models that can identify the thought markers of suicidal subjects through words and vocal characteristics [43]. Major works in [44][45][46] highlight opinion mining and nature-inspired algorithms implemented in the health care sector.
Depression and anxiety disorders are very closely related to substance abuse. The study in [47], using self-organizing maps, provides sufficient evidence to detect the depression clusters formed from the lifestyle-environ variables of a health survey. It has also been explored in [48] that depression is a major effect of substance abuse, and sometimes vice versa. EEG, galvanic skin response and eye-tracking movements are used to classify them with an accuracy of 75% and an F-score of 80%. Emotional regulation and symptoms [49] are also used with machine learning to classify active depressive disorders. A research study on the same concepts [50] used the five personality factors (Neuroticism, Extraversion, Openness, Agreeableness and Conscientiousness) of the NEO Five Factor Inventory (NEOFFI) to compare an opioid-dependent sample with control variables. A high score on neuroticism and low scores on extraversion and conscientiousness are the patterns to look for so that dependents can be identified and treatment ensured. Another finding [51] confirmed that psychological parameters such as impulsivity and sensation seeking are the key correlates of the behavioral patterns exhibited in drug dependence and conduct disorder. A pilot study [52] characterized human behaviors using the NEOFFI model and concluded that a higher value of neuroticism opens a door to substance intake, whereas [53] mapped behavioral economics to personality traits. A significant difference between the psychological profiles of drug users and non-users was also identified with the big five factor model NEOFFI [54]. The digital health innovation action plan [55] under Johns Hopkins University mentions that, as behavioral disorders progress, there is always a high chance of patients leaving treatment, being unable to reach healthcare personnel in time, and relapsing, even with high-risk and high-cost treatments [56], which creates room for early intervention.
Study [57] reveals that substance use disorders, along with associated depression and anxiety due to alcohol, cannabis or opiates, are a complex brain disease and a major public health problem, with an estimated prevalence in India of around 35-50 per 1000 population. Patients with substance use disorders form almost one fourth of the clinical load that any psychiatrist handles in day-to-day practice [58]. There is also mention of how little survey and research work exists at the regional level on variability in prevalence, regional bias and substance-specific analysis. Among this surveyed group, a significant proportion have co-morbid psychiatric disorders [59]. A social research study in Chandigarh [60], in which a Rapid Assessment Survey (RAS) was administered with respondent-driven sampling, found that substances such as opioids are highly prevalent in the U.T. of Chandigarh. Another RAS-based survey in Punjab [61], administered to 6600 community-dwelling substance-dependent persons, found that 88% were opioid dependent and reported high-risk behaviour, supported by figures from the epidemiology of substance use and dependence in the state of Punjab, India [62]. A comprehensive review [63] discusses the mining of some major human psychological disorders (stress, depression, autism, anxiety, attention-deficit hyperactivity disorder, Alzheimer's, Parkinson's, insomnia, schizophrenia and mood disorder) using different supervised and nature-inspired computing techniques, based on a three-dimensional search space, i.e. disease diagnosis, psychological disorders and classification techniques. According to the NDDTC report, about 14.6% of the population uses alcohol nationally. After alcohol, cannabis is used by 2.8% of the population (3.1 crore individuals), with the highest prevalence in Uttar Pradesh, Punjab, Sikkim, Chhattisgarh and Delhi.
Nationally, the most common opioid used is heroin (1.14%), followed by pharmaceutical opioids (0.96%) and opium (0.52%), with the state of Sikkim again among the top three. On the other hand, depression, stress and happiness were assessed by psychological tests, i.e., the Beck Depression Inventory (BDI), the perceived stress scale and the Oxford happiness inventory [64]; descriptive statistics and a t-test led to the conclusion that drug addicts differ significantly from non-addicts. A research analysis [65] examined in depth the rationale of the treatment and rehabilitation components of the scheme, its implementation mechanisms, its intervention processes at community level and the efficiency of its monitoring systems, revealing a gap in early screening and preemption of substance abuse.
Most of the literature reviewed covers various demographies within India and abroad, but a study of this kind is novel for the state of Sikkim. As the national survey also pointed out the prevalence figures for the state of Sikkim, the researchers wanted to follow up on it with a preliminary cause-effect study. Thus, in the proposed research work, addiction is studied and correlated with the mental illness it leads to, in particular depression and anxiety. The article also introduces a new point of view based on the efficacy of machine learning in identifying predictive behavior for appropriate intervention.

Work, Methodology and Observation

Work [11]
Methodology:
- The study employs both qualitative and quantitative methods to collect data from primary and secondary sources.
- The tools for primary data collection are questionnaires and interviews; the sample size is 100.
- Interviews were conducted with local respondents of the state who consumed alcohol.
Observation:
- Correlation with factors such as diseases, suicide, etc. and their impact on the study.

Work [12]
Methodology:
- Unsupervised Self Organising Map (SOM) to create clusters.
- Supervised boosted regression algorithm to describe clusters.
- Ninety-six "lifestyle-environ" variables were used from the National health and nutrition examination study.
- The nine-item, self-reported Patient Health Questionnaire-9 (PHQ-9) was used to assess depressive symptoms.
- Multivariate logistic regression validated clusters and controlled for possible socio-demographic confounders.
Observation:
- More clinical aspects could be incorporated so that the learning adapts to patients with mental illness.
- Psychotic parameters can be considered to rule out illness due to depression or depression due to illness.
- Learning models can be tested for better accuracy and precision.

Work [13]
Methodology:
- Face-to-face interactions and assessments with clinicians for 21 outpatients with mental illness, diagnosed with schizophrenia.
- Aggregated ecological momentary assessment (EMA) scores that measure several dynamic dimensions of mental health and functioning in people.
- Bivariate regression analysis.
- Person-specific models generated using Random Forest (RF) to gain insight into predicting smoothed EMA sum scores.
Observation:
- New invention techniques to automatically alert clinicians.
- Train accurate personalized models that require fewer individual-specific data to quickly adapt to new users.

Work [17]
Methodology:
- Epidemiological data were collected by administering pre-devised questionnaires to n = 223 prescription opioid abusers reporting for treatment at five different drug abuse treatment centers across Sikkim.
- The Addiction Severity Index Lite (ASI Lite) was administered to gather information on seven domains of a patient's life: medical, employment/support, drug and alcohol use, legal, family, social relationship, and psychiatric problems.
- The Fagerstrom Test for Nicotine Dependence, a six-item questionnaire, assessed the pattern and severity of tobacco use among prescription opioid abusers.
- Quality of life questionnaire.
- Mean and standard deviation (SD).
- The Chi-squared test was used to test hypotheses between categorical variables. The significance level was set at P < 0.05.
- Odds ratio (OR), relative risk (RR) and 95% confidence interval (CI) were calculated to estimate associated risk.
- PASW 18.0 was used.
Observation:
- The abusers' retrospective data were analysed, but relapse and follow-ups were not covered.
- No proper study of respondents ending up with mental illness or even harming themselves to the point of death.

Work [18]
Methodology:
- All consecutive cases of suicide attempts treated in a general hospital were evaluated for psychosocial and clinical risk factors, suicide characteristics, psychiatric morbidity, comorbidity and psychiatric diagnosis using ICD-10.
- The presumptive stressful life event scale was used to calculate the life events score.
- A self-designed proforma on the factors responsible for the attempts was administered to the subjects.
Observation:
- Community-based studies can reveal more significant factors relating to suicide attempts.
- Data acquired from suicide attempts can be analysed to find the root cause.

Work [20]
Methodology:
- Machine learning algorithms were used with the subjects' words and vocal characteristics to classify 379 subjects, recruited from two academic medical centers and a rural community hospital, into one of three groups: suicidal, mentally ill but not suicidal, or controls.
- Trait analyses focus on stable characteristics rooted in and state analyses measure dynamic characteristics like verbal and nonverbal communication, termed thought markers.
- The Suicidal Adolescent Clinical Trial, the single-site which used machine learning to analyze interviews with 60 suicidal and control patients, classified patients into suicidal or control groups with greater than 90% accuracy.
- Subjects were taken from different hospitals in the US.
- Data were collected and validated by trained mental health professionals.
- Each subject completed standardized tools: the Columbia-Suicide Severity Rating Scale, the Young Mania Rating Scale and the Hamilton Rating Scale for Depression.
- Linguistic and acoustic feature extraction was performed based on vocal and prosodic parameters, with correlation and normalization.
- A supervised learning Support Vector Machine (SVM) approach was used to some accuracy.
Observation:
- Based on the subjects' self-reports, which means that some patients could have been disingenuous.
- Speech recognition and language processing can be used to identify vulgarity and emotional and emotionless verbal communication.
- Nonverbal states of thought can be better marked with electronic sensors and smart phones.

Work [29]
Methodology:
- A retrospective chart review was used. Patients with a history of current drug use seeking emergency services for any medical or surgical consequence incident to substance abuse from July 2000 to June 2005 (60 months) were included in the study.
- Data were generated from the emergency case register, hospital records and case sheets.
- SPSS 10.0 was used for data analysis.
- Data were collected on demographics, socioeconomics, drug use variables, high-risk variables, reason for seeking treatment and details of treatment.
Observation:
- It is an important measure for assessing treatment demand from substance abusers and can be an effective tool for a preliminary assessment of the magnitude and pattern of substance abuse in the community.
- The community at large is also exposed to this abuse, but no studies have been done.

Work (reference number not given)
Methodology:
- Scaling in terms of substance abuse, delinquency and problem behavior.
- Hypothetical models were tested by structural equation modelling (LISREL), with Maximum Likelihood for normality.
- Chi-square, root mean square error of approximation, goodness-of-fit index and standardized root mean residual.
- Incremental fit measures include the non-normed fit index and the comparative fit index from the amount of data analyzed.

Work [42]
Methodology:
- Technology-driven advanced computational and artificial intelligence methods employed to supplement the support provided by moderators/clinicians.
- Automate user-tailored therapy using natural language analysis.
Observation:
- Can be deployed at greater scale.
- Machine learning can be used to classify subjects, with personalized therapy provided to the individual.

Work [50]
Methodology:
- Examines personality characteristics using the Five Factor Model.
- Sensation seeking and religious/spiritual well-being considered.
Observation:
- Higher levels of Neuroticism and lower levels of Openness to experience were found in both substance-dependent groups.
- Highlights a link between polydrug dependence and problematic personality structure.

Problem Identification
Existing work on licit and illicit substance consumption has focused only on the measures taken during treatment and follow-up procedures. Methodologies for identifying substance abusers, however, have not been addressed or assessed using an appropriate and effective tool. The probability of a substance abuser developing a mental illness, or self-harming to the point of death, has not yet been standardized. It is therefore difficult to assess the availability and acceptability of treatment for individuals and the community at large. There is no concrete evidence of a public, data-driven survey that collects real-time, authentic data on substance and alcohol abuse and identifies the predictor variables that incline both men and women towards consumption of licit or illicit substances. The quantity and quality of drug prevalence are known; however, very little work has been done on identifying focus groups and regions needing attention across the population, or on the use of technology and its intervention. There is a dire need for primary-level screening of mental illnesses such as depression, schizophrenia and dementia. These ailments also need to be analysed in relation to drug and substance abuse.

Methodologies
Designing a technological intervention requires a deep understanding of the subjects' profiles, their behaviour in substance consumption and other conditions. In this respect, technology plays a vital role in identifying the inherent patterns. The research methodology addresses this by incorporating elements of qualitative data collection from both normal subjects and patients under treatment in rehabilitation centers.
The quantitative evaluation of ML models is performed to recognize the relation between substance abuse, intake patterns and their impact on mental instability. A field research setting is established for this work, with a controlled group of people as the subjects of study. They are administered a set of questionnaires through an informed consent process. For normal subjects, aged more than 16 years and residing in the state of Sikkim, a sample frame of 500 college-going students was taken to check the prevalence. Another sample frame of 200 patients under treatment for cognitive behavioral disorder due to acute consumption of licit or illicit substances, as per DSM diagnostics, was taken from identified rehabilitation centers within the state of Sikkim. The research assumes a random sampling error of 2-5% and uses the snowball sampling method.
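As a rough check on the 2-5% sampling error assumed above, the margin of error of a proportion can be computed from the sample size. The sketch below uses the textbook formula for simple random sampling, e = z * sqrt(p(1 - p) / n), with z = 1.96 (95% confidence) and the worst case p = 0.5. These choices and the function name are assumptions rather than values reported by the study, and snowball sampling does not strictly satisfy the random-sampling assumption behind the formula.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error of a proportion under simple random sampling.

    n: sample size; p: assumed proportion (0.5 is the worst case);
    z: normal quantile (1.96 for 95% confidence). The defaults are
    illustrative assumptions, not values reported by the study.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)

# Sample frames mentioned in the text: 200 patients, 500 students, 700 combined.
for n in (200, 500, 700):
    print(n, f"{100 * margin_of_error(n):.2f}%")
```

Under these assumptions the 500-student frame gives roughly 4.4% and the combined 700 subjects roughly 3.7%, both inside the stated range, while the 200-patient frame on its own gives about 6.9%.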
A self-reported survey is conducted within the selected population to collect a data set, using various standard formats, to understand physiological and psychological parameters. To determine the important features that indicate classes of mental illness such as depression and anxiety severity, and other underlying patterns, nearly 515 responses were collected through a Google Form and the circulation of hard-copy questionnaires among the targeted sample. Each participant is assessed to establish whether they use drugs, substances, alcohol or marijuana, or have been exposed to them at least once in their lifetime, and this is related to evidence of anxiety and depression as an outcome. The reported data set is used to train the ML models to analyse and reveal the outcome of drug intake and its addictive consequences. Alcohol and drug dependencies are screened through the National Institute on Drug Abuse (NIDA) guides and the Alcohol Use Disorders Identification Test (AUDIT). These are simple questionnaire-based instruments to screen and identify people at risk from alcohol or substance consumption. Similarly, Generalized Anxiety Disorder 7 (GAD-7), a self-reported multiple-choice questionnaire, is used to screen the severity of generalized anxiety disorder. To measure the severity of depression, the BDI questionnaire is administered to the sample subjects. Since the snowball sampling technique is used, once prevalence and severity are identified, their correlation with illness is also analysed for the same subjects. As stated earlier, the important attributes with respect to drug severity are sought using ML algorithms.
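Once the questionnaire items are summed, the screening scores can be mapped to severity bands. The sketch below uses the commonly published cut-offs for GAD-7 (0-4 minimal, 5-9 mild, 10-14 moderate, 15-21 severe) and for the BDI-II (0-13 minimal, 14-19 mild, 20-28 moderate, 29-63 severe). The study does not state which cut-offs or which BDI edition it used, so the bands, the function names and the sample answers are all assumptions made for illustration.

```python
from bisect import bisect_left

# (upper-inclusive band edges, labels, maximum total score).
# Assumed from commonly published cut-offs, not taken from the study.
GAD7_BANDS = ((4, 9, 14), ("minimal", "mild", "moderate", "severe"), 21)
BDI2_BANDS = ((13, 19, 28), ("minimal", "mild", "moderate", "severe"), 63)

def gad7_total(item_scores):
    """Sum the seven GAD-7 items, each scored 0..3."""
    if len(item_scores) != 7 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("GAD-7 expects seven items scored 0..3")
    return sum(item_scores)

def severity(total, bands):
    """Map a summed questionnaire score to its severity label."""
    edges, labels, max_score = bands
    if not 0 <= total <= max_score:
        raise ValueError(f"score {total} outside 0..{max_score}")
    # bisect_left keeps a score equal to an edge in the lower band,
    # which matches upper-inclusive edges such as 0-4 / 5-9.
    return labels[bisect_left(edges, total)]

# Hypothetical respondent: item answers are made up for illustration.
answers = [1, 2, 1, 0, 2, 1, 1]
print(severity(gad7_total(answers), GAD7_BANDS))  # prints "mild" (total 8)
```

Keeping the band edges in data rather than in branching code makes it easy to swap in whichever cut-offs a clinical partner actually uses.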

System design and Analysis
The overall strategy for approaching the solution is shown in Figure 1. It highlights how the problem is addressed and how data is collected from the non-clinical samples. Since a large number of responses are collected, analysing each response manually to find the important features with respect to substance consumption and the depression encountered is difficult. A classification model is therefore selected to simplify the task of assigning each response to its class. Classifying each sample into its respective class is considered a basic data mining approach for broad multidimensional data. Various ML-based models are used for classification, as discussed in the results section. Many variables have little impact on the decision and are redundant, while some variables are strongly correlated. Deciding which variables are crucial is also part of mining the responses collected from the sample subjects. The analysis lies in finding a subset of variables that can adequately define the possible outcome from the original data set. Thus, classical dimension reduction techniques such as Linear Discriminant Analysis (LDA), Factor Analysis (FA) and information gain using entropy are implemented in this study. The classification models are selected on the basis of their interpretability, accuracy and scalability. Popular algorithms, namely K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Decision Tree (DT), are chosen to perform the classification, with training and testing data in the ratio 80% to 20%. The algorithm that assigns the responses to their classes with high accuracy and balanced specificity and sensitivity is used as the classifier.
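The evaluation protocol described above (an 80/20 split, then accuracy, sensitivity and specificity) can be sketched as follows. To stay self-contained, the example implements only KNN in plain Python; SVM and DT would normally come from an ML library and be evaluated with the same split and metrics. The two-feature data are synthetic and generated only for illustration, and the function names, k = 5 and the stratified split are assumptions, not details reported by the study.

```python
import math
import random
from collections import Counter

def stratified_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle each class separately and hold out test_fraction of it."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        members = [(x, y) for x, y in zip(rows, labels) if y == cls]
        rng.shuffle(members)
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def evaluate(train, test, k=5):
    """Accuracy, sensitivity and specificity for binary labels 0/1."""
    tp = tn = fp = fn = 0
    for x, y in test:
        pred = knn_predict(train, x, k)
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 0:
            tn += 1
        else:
            fp += 1
    return {
        "accuracy": (tp + tn) / len(test),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Synthetic stand-in for (substance score, anxiety score) pairs; not study data.
rng = random.Random(42)
rows, labels = [], []
for _ in range(50):
    rows.append((rng.gauss(5, 2), rng.gauss(4, 2)))
    labels.append(0)
    rows.append((rng.gauss(20, 2), rng.gauss(14, 2)))
    labels.append(1)

train, test = stratified_split(rows, labels)
metrics = evaluate(train, test)
print(len(train), len(test), metrics)
```

Splitting each class separately keeps both classes present in the 20% test portion, so sensitivity and specificity are always defined.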

Figure 1. Architecture of Proposed Model
The proposed approach analyses the survey responses after feature selection and uses them for screening. Potential indicators are chosen considering two main aspects, i.e. relevance to the class and feature interdependence, which yields a subset of features that describes the sample population well. This further helps to determine the underlying pattern in the population that caused them to respond in the manner they did.
The responses are collected through the AUDIT, NIDA, BDI and GAD questionnaires, each of which comprises several questions. To capture the underlying pattern of responses, or latent variables, the responses are analysed using factor analysis, a descriptive method that finds the latent variables and their relation to the original attributes. Feature selection is also performed using the filter method. Information gain is used to select features after ranking them in order of importance. Using the concept of entropy gives a subset of features that are more relevant to the class. The final evaluation is based on entropy and linear discriminant analysis, determining discriminating features as important features for classification based on inter-class distance.
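The entropy-based ranking described above can be sketched in a few lines. Information gain is the entropy of the class label minus the expected entropy after grouping the responses by a feature's value; features are then sorted by gain. The toy responses, feature names and function names below are made up for illustration and are not the study's data or results.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Class entropy minus expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_features(table, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical categorical responses for eight respondents (illustration only).
labels = [1, 1, 1, 1, 0, 0, 0, 0]          # 1 = screened positive
table = {
    "daily_drinker": ["yes", "yes", "yes", "yes", "no", "no", "no", "no"],
    "cannabis":      ["yes", "yes", "yes", "no", "no", "no", "no", "yes"],
    "gender":        ["m", "f", "m", "f", "m", "f", "m", "f"],
}
ranking = rank_features(table, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy table the first feature separates the classes perfectly (gain of 1 bit), the second only partly, and the third not at all, so a filter method would keep the first and discard the last.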
The histogram of "Age" for the survey respondents plots the mean, median and mode to describe the available data. The survey was taken by students in the age range of 16 to 25, and a few responses came from people above 40.

Results and Discussion
Variables within a data set can be related for many reasons. For example, one variable could cause or depend on the values of another variable, one variable could be lightly associated with another, or two variables could depend on a third unknown variable. Understanding the relationships between variables is useful in data analysis and modeling. In this research, the statistical relationship between such variables is interpreted using correlation, and new samples are assigned to the correct class using classification techniques.

Relation between the Substance intake and Mental Illness
In the analysis, scatter plots of pairs of variables showed that all the principal features are positively correlated with each other. The plot of AUDIT against BDI score in Figure 5 shows a positive correlation: individuals with instances of alcohol consumption tend to have depressive moods. Similarly, the plot in Figure 6 shows a positive relation between consumption and anxiety disorder. These findings agree with those of various researchers [27][28][29][30][31][32][33] that, in mental disorder, one cannot rule out the possibility of the patient having consumed a substance of some nature at some stage of life. These findings also support the discussion [66] that anxiety is also caused by substance intake, as depicted in Figure 8. Although the depression index of the entire study, as in Figure 9, is concentrated in the minimal and mild classes, various parameters and features such as interest, peer group motivation and first-time use, in the consumption of alcohol as well as illicit drugs, lead to illness, as also shown by many hospital-based studies presented in the literature reviewed in this work.
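The positive relationship read off the scatter plots can be quantified with a correlation coefficient. The sketch below computes Pearson's r in plain Python; the study does not say which coefficient it used, so Pearson's r is an assumption, and the paired AUDIT and BDI totals are invented for illustration rather than taken from the survey.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)

# Hypothetical paired totals for seven respondents (illustration only).
audit_scores = [2, 5, 8, 12, 16, 20, 25]
bdi_scores = [4, 7, 9, 15, 18, 22, 30]
r = pearson_r(audit_scores, bdi_scores)
print(f"r = {r:.3f}")
```

A value of r near +1 corresponds to the upward-sloping cloud described for the AUDIT and BDI plot, a value near 0 to no linear relationship. Correlation alone does not establish which variable drives the other.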

Discussion on Efficacy of ML
The outcome of this research study validates the potential of the ML approach to explain the convoluted relationships among features connected with disorder (depression and general anxiety) and substance intake habits. Substance intake is a known factor for depression [67], and high consumption of alcohol and drugs were the highest-ranked features in establishing the relationship with mental illness. Crying and libido were found to have very minimal impact on the core symptoms of depression and anxiety. However, daily drinking in combination with a few smokes of cannabis is the deciding factor that leads to mental illness. There is not much difference between the patterns of males and females subjected to illness due to substance intake. In comparison, adequate sleep and not being restless are associated with a reduced measure of symptoms due to substance abuse. The neurobiological effects [68] of consumption and its prolonged intake are still unclear. Previous research has predominantly created depression clusters from various socio-economic, demographic and lifestyle variables; in this study, however, the main findings show a higher proportion of alcohol (51%) and cannabis (29%) in creating depressive mood and anxiety. These findings call for more in-depth probing and analysis. The ML approach revealed alcohol- and drug-specific variables that classified survey samples into classes of illness. Of the various parameters that combine to result in mental illness, drug intake and alcohol seem to have an equal and common impact. An important split emerged between parameters classifying age and addiction, which often showed alternate forms among individuals with mental illness and vice versa.

Limitations and Constraints
Firstly, the paper is limited to a sample survey carried out in the state of Sikkim, so the findings cannot be deemed conclusive unless similar work is done across the wider population; the findings are nevertheless relevant because they are in line with the prevalence report published by the national survey of India. During the initial survey, some sample subjects were unwilling to respond to the questionnaire, many students were absent, and it is likely that some respondents refrained from giving details of their substance use. Another challenge is characterizing parameters according to DSM standards for the manifestation of mental illness such as depression and anxiety episodes, for example, deciding whether depression was caused by alcohol or vice versa. The non-availability of existing datasets is also a constraint on this study. Finally, the self-reported questionnaire administered to the sample population could suffer from bias in participants' answers and from data manipulation, and it requires further validation.

Recommendations
This section gives a brief list of recommendations arising from the design and development of this research work:
a) Preliminary screening environment: This includes qualitative analysis to better understand the relevant psychological parameters associated with substance use behavior and the associated mental illness. We also found it useful to identify the common co-morbidities associated with mental illness, in order to discriminate between them properly and reduce the number of false positive classifications.
b) Training of the model with more data: We encourage collecting more data samples from real-life environments in order to build a more robust model.
c) Security and privacy of the information collected: Storage and publication of private data require security and privacy to be maintained.
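
One common way to act on the privacy recommendation is to pseudonymise survey records before storage, replacing direct identifiers with a keyed hash. The sketch below is an illustration under stated assumptions, not the procedure used in this study: the field names (`student_id`, `name`, `audit`, `bdi`) and the helper functions are hypothetical, and the secret key would need to be stored separately from the data.

```python
import hashlib
import hmac
import secrets


def pseudonymize(participant_id: str, key: bytes) -> str:
    """Return a keyed HMAC-SHA256 digest that stands in for the real identifier."""
    digest = hmac.new(key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def strip_identifiers(record: dict, id_field: str, drop_fields: set, key: bytes) -> dict:
    """Drop direct identifiers and replace the ID field with a pseudonym."""
    cleaned = {
        name: value
        for name, value in record.items()
        if name not in drop_fields and name != id_field
    }
    cleaned["pseudonym"] = pseudonymize(str(record[id_field]), key)
    return cleaned


# The key is an assumption of this sketch: generate it once and keep it
# separate from the stored survey data.
secret_key = secrets.token_bytes(32)

# Hypothetical survey record (NOT study data).
record = {"student_id": "S-001", "name": "Example Name", "audit": 12, "bdi": 18}
safe_record = strip_identifiers(record, "student_id", {"name"}, secret_key)
print(sorted(safe_record))
```

Using a keyed hash rather than a plain hash means that someone who knows the identifier format cannot recompute the pseudonyms without the key. Pseudonymisation alone does not make data anonymous, since combinations of the remaining answers may still identify a respondent.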

Conclusion and Future Scope
This paper has presented a prevalence study of substance abuse in the designated demography, with positive outcomes. It uses general behavioral assessment through standard questionnaires to detect early stages of anxiety and depression in persons with a history of substance abuse. We identified manifestations of substance intake leading to mental illness such as depression and anxiety episodes. The design used ML models for classification, with an accuracy of 80%. When classifying the multi-variable parameters into different classes of representation and manifestation associated with mental illness episodes, many other features, such as neurocognitive approaches and the assertions of individuals addicted to a particular drug, can be considered for effective feature selection to help keep them in abstinence. This work sheds some light on dealing with mental health issues in relation to substance addiction (abuse) and can assist future work in this field. Future work in line with the current study should address other mental illnesses and co-morbidities. The ML models applied in this work need optimization of their validation accuracy.
In future, a refined repository of electronic records of patients with disorders due to substance abuse could be created, based on the various models for testing personality traits and on severity test models. Such a repository could also give insights for predicting major health hazards in patients or healthy individuals, so that medical needs can be responded to immediately.