DashBoard - DashB.ai
A no-code machine learning platform
Overview
- This is a web app that automates the data preprocessing pipeline. The goal is to automate the whole machine learning pipeline, but the project is currently complete only up to data preprocessing.
- The project is currently in the development phase.
- Users can upload comma-separated value (CSV) files or fetch data directly from a MySQL database (make sure MySQL is installed on your system).
- Users decide which operations to perform and which to skip; the selected operations are passed to the pipeline, which then shows the result.
- Users can visualize the data with the dataviz tool that ships with DashB.ai, without writing any code (built with Dash by Plotly).
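The idea of passing only the selected operations to the pipeline can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's actual code: the function names, the `OPERATIONS` registry and the sample column names (`Area`, `Price`) are all hypothetical.

```python
import csv
import io


def load_csv(text):
    """Parse CSV text into a list of row dictionaries (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_column_names(rows):
    """Strip surrounding whitespace from column names and lower-case them."""
    return [{key.strip().lower(): value for key, value in row.items()} for row in rows]


def drop_rows_missing_target(rows, target="price"):
    """Drop rows whose target value is empty (hypothetical target column)."""
    return [row for row in rows if (row.get(target) or "").strip() != ""]


# Hypothetical registry mapping an operation name to the function that performs it.
OPERATIONS = {
    "clean_column_names": clean_column_names,
    "drop_rows_missing_target": drop_rows_missing_target,
}


def run_pipeline(rows, selected):
    """Apply only the operations the user selected, in the order given."""
    for name in selected:
        rows = OPERATIONS[name](rows)
    return rows


raw = " Area ,Price\n50,100\n60,\n70,140\n"
rows = run_pipeline(load_csv(raw), ["clean_column_names", "drop_rows_missing_target"])
print(rows)  # [{'area': '50', 'price': '100'}, {'area': '70', 'price': '140'}]
```

Keeping the operations in a name-to-function mapping means an empty selection leaves the data untouched, and new operations can be added without changing `run_pipeline`.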
Built With
- scikit-learn
- Plotly
- Dash
- Bootstrap
Getting Started
To get a local copy up and running, follow these simple steps. Make sure git is installed on your machine.
Installation
- Clone the repo
git clone https://github.com/IMsumitkumar/No-code-ML-platform-DashB.ai
- Create a virtual environment and activate it
conda create -n <env_name> python=3.7
conda activate <env_name>
- Install dependencies (run this from inside your local repository)
pip install -r requirements.txt
RUN
STEP 1 : Migrate the database tables and create a superuser
python manage.py makemigrations
python manage.py migrate
python manage.py createsuperuser
username : *****
email : *****
password : ******
STEP 2 : Start the development server
python manage.py runserver
STEP 3 : OPTIONAL
To enable password recovery by email, set your own email address and password in DashB/settings.py.
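As a rough guide, the email section of DashB/settings.py might look like the sketch below. The setting names are standard Django email settings, but the values are assumptions: the SMTP host and port assume a Gmail account, and the environment variable names `DASHB_EMAIL_USER` and `DASHB_EMAIL_PASSWORD` are hypothetical. Check the project's own settings.py for what it actually expects.

```python
import os

# Standard Django email settings; the values are illustrative assumptions.
EMAIL_BACKEND = "django.core.mail.backends.smtp.EmailBackend"
EMAIL_HOST = "smtp.gmail.com"  # assumption: a Gmail account is used
EMAIL_PORT = 587               # assumption: SMTP submission port with STARTTLS
EMAIL_USE_TLS = True

# Hypothetical environment variable names; reading credentials from the
# environment keeps them out of version control.
EMAIL_HOST_USER = os.environ.get("DASHB_EMAIL_USER", "")
EMAIL_HOST_PASSWORD = os.environ.get("DASHB_EMAIL_PASSWORD", "")
```

Typing the address and password directly into settings.py also works, but then take care not to commit them.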
Preprocessing Pipeline Tree
├── Handle Datatypes
│   ├── Drop unnecessary features.
│   ├── Replace inf with NaN.
│   ├── Make sure all the column names are of string type and clean them.
│   ├── Remove the column if target column has NaN.
│   ├── Remove duplicate columns.
│   ├── Handle numerical, categorical and time features.
│   └── Try to determine the ML use case and encode.
├── Handle Missing Values
│   ├── Numerical Features
│   │   ├── Replace with mean.
│   │   ├── Replace with median.
│   │   ├── Replace with mode.
│   │   ├── Replace with standard deviation.
│   │   └── Replace with zero.
│   └── Categorical Features
│       ├── Replace with mean.
│       ├── Replace with "Missing".
│       └── Replace with most frequent value.
├── Removing zero and near zero variance columns
│   ├── Eliminate the features that have zero variance.
│   └── Eliminate the features that have near zero variance.
├── Group Similar Features
│   └── Group more than two features and make new features from them.
├── Normalization and Transformation
│   ├── Operations to apply only on numerical features
│   │   ├── ZScore
│   │   ├── MinMax
│   │   ├── Quantile
│   │   ├── MaxAbs
│   │   └── Yeo-Johnson
│   └── Target transformation (regression)
│       ├── Box-Cox
│       └── Yeo-Johnson
├── Making Time Features
│   ├── Take a time feature and extract more features from it
│   └── (Day, Month, Year, Hour, Minute, Second, Quantile, Quarter, Day of week, Week day name, Day of year, Week of year)
├── Feature Encoding
│   ├── Ordinal Encoding
│   │   ├── LabelEncoding
│   │   └── Target Guided ordinal encoding
│   └── One hot encoding
│       ├── KDD orange
│       ├── Mean Encoding
│       └── Counter/frequency encoding
├── Removing Outliers
│   ├── Isolation Forest
│   ├── KNN
│   ├── PCA
│   └── Elliptical envelope
├── Feature Selection
│   ├── Chi squared (Not working perfectly)
│   ├── RFE (Not working on all the data)
│   ├── Lasso (works perfectly)
│   ├── Random Forest
│   ├── lgbm (works perfectly)
│   └── Remove zero variance features
├── Imbalance Dataset (Not done yet)
│   ├── Ensemble techniques automatically handle imbalanced datasets
│   ├── Undersampling (Not a good idea)
│   ├── Oversampling
│   ├── SMOTE
│   └── Isolation Forest
└── Next Step
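To make a few of the steps in the tree concrete, here is a minimal sketch of median imputation, the zero-variance check, MinMax and ZScore scaling, counter/frequency encoding and time feature extraction. It uses only the Python standard library so it runs anywhere; the project itself is built with scikit-learn and presumably relies on that instead. All function names here are hypothetical.

```python
import math
import statistics
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def impute_median(values):
    """Handle Missing Values: replace None, NaN and inf with the median of the rest."""
    finite = [v for v in values if v is not None and math.isfinite(v)]
    median = statistics.median(finite)
    return [v if v is not None and math.isfinite(v) else median for v in values]


def is_zero_variance(values):
    """Zero variance check: True when every value in the column is identical."""
    return len(set(values)) <= 1


def min_max(values):
    """MinMax: rescale to [0, 1]. Zero-variance columns must be dropped first."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def zscore(values):
    """ZScore: subtract the mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def frequency_encode(categories):
    """Counter/frequency encoding: replace each category with its count."""
    counts = Counter(categories)
    return [counts[c] for c in categories]


def time_features(ts):
    """Making Time Features: derive several features from one timestamp."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "quarter": (ts.month - 1) // 3 + 1,
        "day_of_week": ts.weekday(),  # Monday is 0
        "day_name": DAY_NAMES[ts.weekday()],
        "day_of_year": ts.timetuple().tm_yday,
        "week_of_year": ts.isocalendar()[1],
    }


ages = [20.0, None, 40.0, float("inf"), 30.0]
filled = impute_median(ages)
print(filled)                                        # [20.0, 30.0, 40.0, 30.0, 30.0]
print(min_max(filled))                               # [0.0, 0.5, 1.0, 0.5, 0.5]
print(frequency_encode(["a", "b", "a", "c", "a"]))   # [3, 1, 3, 1, 3]
print(time_features(datetime(2021, 3, 15, 14, 30))["day_name"])  # Monday
```

The scaling functions divide by the column's range or standard deviation, which is zero for a constant column. That is one reason to remove zero variance columns before normalization, as the tree's ordering suggests.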
Directory Tree
├── accounts
│   └── # handles login, signup and password recovery.
├── DashB
│   └── # main folder; contains wsgi, routing, settings and urls.
├── data
│   └── # main folder for performing the pipeline.
├── Viz
│   └── # project app for the data visualization tool.
├── static
│   └── # contains static files.
├── media
│   └── # storage folder for uploaded media.
├── templates
│   └── # contains landing page templates.
├── manage.py
├── requirements.txt
├── LICENSE
├── README.md
└── db.sqlite3
Contributing
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
https://github.com/IMsumitkumar/No-code-ML-platform-DashB.ai/tree/main/DashB
- Create your Feature Branch
git checkout -b feature/AmazingFeature
- Commit your Changes
git commit -m 'Add some AmazingFeature'
- Push to the Branch
git push origin feature/AmazingFeature
- Open a Pull Request