NLP & Deep learning | Always a learner

**The train time complexity of a machine learning model** — the amount of time taken to train the model.

**The test time complexity of a machine learning model** — the amount of time taken to predict the output for a given input query point.

Time complexity is an essential aspect to know when anyone wants their model to have low latency. Let’s dive deep into the details of how much time and space a wide variety of models require to predict the output.

“Assuming training data has n points with d dimensions “

Given query point (xq), K-NN follows these steps to predict output (yq). …
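The prediction steps are elided above, but a brute-force k-NN prediction can be sketched as follows (a minimal illustration, assuming Euclidean distance and majority vote; the O(n·d) distance scan per query is what makes k-NN slow at test time):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_q, k=3):
    """Predict the label of query point x_q by majority vote
    among its k nearest neighbours (brute force, O(n*d) per query)."""
    # Euclidean distance from x_q to every training point: O(n*d)
    dists = np.linalg.norm(X_train - x_q, axis=1)
    # Indices of the k smallest distances
    nearest = np.argpartition(dists, k)[:k]
    # Majority vote among the k neighbours' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Tiny example: two well-separated clusters
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.2, 0.1]), k=3))  # -> 0
```

This is why, for n training points with d dimensions, test time is O(n·d) per query unless an index structure such as a k-d tree is used.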

- As shown in the image below. nearly 68% of the data is within 1 standard deviation (σ) from the mean (μ), nearly 95% of the data is within 2 standard deviations (σ) from the mean (μ), and nearly 99.7% of the data is within 3 standard deviations (σ) from the mean (μ).
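The 68–95–99.7 rule above can be checked empirically by sampling from a standard normal distribution (a quick sketch using NumPy's random generator):

```python
import numpy as np

rng = np.random.default_rng(0)
# One million samples from a normal distribution with mu=0, sigma=1
x = rng.normal(loc=0.0, scale=1.0, size=1_000_000)

for n_sigma in (1, 2, 3):
    # Fraction of samples within n_sigma standard deviations of the mean
    frac = np.mean(np.abs(x) <= n_sigma)
    print(f"within {n_sigma} sigma: {frac:.4f}")
```

The printed fractions come out close to 0.6827, 0.9545, and 0.9973, matching the rule.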

- Internal covariate shift occurs when the statistical distribution of a layer’s inputs changes during training as the parameters of the preceding layers change.
- When the input distribution changes, the hidden layers have to keep adapting to the new distribution. This slows down the training process, so it takes a long time to converge to a good minimum.
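The standard remedy for this slowdown is to normalize each mini-batch of activations to zero mean and unit variance, then rescale with learnable parameters — the idea behind batch normalization. A minimal NumPy sketch (gamma and beta are fixed here for illustration; in a real network they are learned):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of activations to zero mean / unit variance
    per feature, then scale and shift by gamma and beta."""
    mu = x.mean(axis=0)            # per-feature batch mean
    var = x.var(axis=0)            # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Features on very different scales get normalized to a stable distribution
batch = np.array([[1.0, 200.0], [2.0, 220.0], [3.0, 180.0]])
out = batch_norm(batch)
print(out.mean(axis=0), out.std(axis=0))  # ~0 and ~1 per feature
```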

- Early stopping is a regularization technique. Overtraining a model on a dataset will cause overfitting.
- Therefore, training has to be stopped when the model starts to overfit. This process of stopping the training early is called early stopping. In early stopping, the hyperparameters could be the no. …
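The procedure above can be sketched as a simple training loop with a *patience* hyperparameter (a minimal illustration; `train_step` and `val_loss_fn` are hypothetical callbacks standing in for a real training framework):

```python
def train_with_early_stopping(train_step, val_loss_fn, max_epochs=100, patience=5):
    """Stop training when validation loss hasn't improved for `patience` epochs."""
    best_loss, best_epoch, wait = float("inf"), 0, 0
    for epoch in range(max_epochs):
        train_step()                      # one epoch of training
        loss = val_loss_fn()              # evaluate on held-out data
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1                     # no improvement this epoch
            if wait >= patience:
                break                     # stop: model has started to overfit
    return best_epoch, best_loss

# Toy validation curve: improves for 3 epochs, then overfits
losses = iter([1.0, 0.8, 0.7, 0.72, 0.75, 0.8, 0.85, 0.9, 0.95, 1.0])
epoch, loss = train_with_early_stopping(lambda: None, lambda: next(losses), patience=3)
print(epoch, loss)  # -> 2 0.7
```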

**a. Data Collection**

**b. Exploratory Data Analysis**

**c. Data Preprocessing**

**d. Feature Engineering**

**e. Feature Selection**

**f. Model Selection and Hyperparameter Tuning**

**g. Model Evaluation and Analysis**

- Many people think machine learning is only about training models, but in fact, there are many steps to follow before training our model.

It's very important to know where our model works well and where it fails. If there is a low-latency requirement, KNN is definitely a poor choice. Similarly, if the data is non-linear, then choosing logistic regression is not a good idea. So let's dive deep into the discussion and find the pros and cons of these models.

Untangle hypothesis testing with a detailed walkthrough

- A hypothesis test is a statistical test that uses a sample of data to provide evidence for accepting or rejecting a claim (the null hypothesis) about the entire population.
- If we have to show that two distributions are different, we prove it by contradiction: we assume both distributions are the same, which is our null hypothesis.

Consider C1 and C2, the population heights of students from two classrooms. The problem is to test whether the mean heights of C1 and C2 are the same.

**Test statistic** = the difference in the means of the two groups.

Observed difference in means (μc1 − μc2) = 30

- The test statistic has to be calculated given the data. …
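One concrete way to carry out this test is a two-sample permutation test: under the null hypothesis the two samples come from the same distribution, so class labels are exchangeable, and we can simulate the distribution of the test statistic by reshuffling. A minimal sketch with hypothetical classroom-height data:

```python
import numpy as np

def permutation_test(c1, c2, n_perm=10_000, seed=0):
    """Two-sample permutation test on the difference in means.
    Null hypothesis: c1 and c2 come from the same distribution."""
    rng = np.random.default_rng(seed)
    observed = abs(c1.mean() - c2.mean())   # the test statistic
    pooled = np.concatenate([c1, c2])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                 # re-assign group labels at random
        diff = abs(pooled[:len(c1)].mean() - pooled[len(c1):].mean())
        count += diff >= observed           # as extreme as what we observed?
    return count / n_perm                   # p-value

# Hypothetical samples of student heights from the two classrooms
rng = np.random.default_rng(1)
c1 = rng.normal(170, 10, 50)
c2 = rng.normal(172, 10, 50)
p_value = permutation_test(c1, c2)
print(p_value)
```

A small p-value (e.g. below 0.05) would be evidence against the null hypothesis that the two classrooms have the same mean height.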

This blog is strictly limited to a code walkthrough for generating a summary using the Text-to-Text Transfer Transformer (T5). If you guys are curious about how T5 works and how it was pretrained and fine-tuned on downstream NLP tasks, check out the following blog.

**Hugging Face** is an open-source NLP library that has made our lives easy when dealing with state-of-the-art transformers, just like scikit-learn for machine learning algorithms.

**TFT5** indicates the TensorFlow implementation of the T5 model; to import the PyTorch implementation, just remove TF from TFT5.

**Model and tokenizer initialization:** Here I used the pretrained T5-small for this task, which has 60 million…

Interesting ideas that help you master the subject

**Deep learning** has been a trending word in technology for the past 6 years, and hundreds of research papers are published every week with new techniques to solve various Natural Language Processing, Natural Language Understanding, and computer vision tasks.

However, as a beginner, one has to focus on the basics and understand how things work.

These are some interesting questions I encountered while preparing for a machine learning interview and tried to answer them.

Check out this for the first part

- Hinge Loss = max(0, 1 − y·(w·x))
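The hinge loss above (with labels y ∈ {−1, +1}) can be computed with a few lines of NumPy — a minimal sketch averaging the loss over a dataset:

```python
import numpy as np

def hinge_loss(w, X, y):
    """Mean hinge loss max(0, 1 - y * (w . x)) over the dataset.
    Labels y must be in {-1, +1}."""
    margins = y * (X @ w)                     # signed margin of each point
    return np.maximum(0.0, 1.0 - margins).mean()

w = np.array([1.0, -1.0])
X = np.array([[2.0, 0.0], [0.0, 2.0]])
y = np.array([1, -1])
print(hinge_loss(w, X, y))  # both points have margin 2 -> loss 0.0
```

Points with margin ≥ 1 contribute nothing to the loss; misclassified or low-margin points are penalized linearly, which is what makes this the loss behind soft-margin SVMs.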