2024
Uniform Sampling
To uniformly sample points over a volume one needs to consider the infinitesimal volume element corresponding to that specific geometry in a given coordinate system.
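As a small illustration of why the volume element matters, consider a disk. In polar coordinates the area element is $r\,dr\,d\theta$, so drawing the radius uniformly would crowd points near the centre; taking the square root of a uniform variate corrects for this. The sketch below is a minimal example under that assumption, and the name `sample_disk` is hypothetical rather than taken from the post.

```python
import math
import random


def sample_disk(radius=1.0, rng=random):
    """Draw one point uniformly from a disk of the given radius."""
    # The area element is r dr dtheta, so the radial CDF is (r / radius)^2.
    # Inverting that CDF gives r = radius * sqrt(u) for u uniform on [0, 1).
    r = radius * math.sqrt(rng.random())
    # The angle carries no extra weight, so it is drawn uniformly.
    theta = 2.0 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)
```

A quick sanity check is that roughly a quarter of the samples should fall within half the radius, since that inner disk holds a quarter of the area.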
More SQL
Let’s say we have the following table Users, where user_id is the column with unique values for this table.
The goal is to find, for each user, the smaller of the two values col_a and col_b.
+-------------+------+
| Column Name | Type |
+-------------+------+
| user_id     | int  |
| col_a       | int  |
| col_b       | int  |
+-------------+------+
To do this, we can use the LEAST function (and its counterpart GREATEST):
SELECT
    user_id,
    LEAST(col_a, col_b) AS min_val
FROM Users;
Common data structures
Stack
Sliding window
This iterative technique is useful for problems that ask for the longest or shortest contiguous subarray or substring satisfying a given criterion. We maintain two pointers, left and right, both initially located at the first index.
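To make the two-pointer idea concrete, here is a minimal sketch for one classic instance: the length of the longest substring without repeated characters. The problem choice and the function name `longest_unique_substring` are assumptions for illustration, not necessarily the example used in the post.

```python
def longest_unique_substring(s):
    """Return the length of the longest substring of s with no repeated character."""
    last_seen = {}  # character -> index of its most recent occurrence
    left = 0        # left edge of the current window
    best = 0
    for right, ch in enumerate(s):
        # If ch already occurs inside the window, move left just past it.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best
```

The right pointer only moves forward and the left pointer never moves back, so the string is scanned once.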
Recursion
A recursive algorithm is one that calls itself on a smaller part of the input. To set up a correct recursion and avoid infinite recursion, one must take the following into account:
- set up the base case
- recursive calls should lead towards the base case
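The two requirements above can be seen in a minimal sketch such as the factorial; the function is chosen here purely for illustration.

```python
def factorial(n):
    """Compute n! recursively for a non-negative integer n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Base case: stops the recursion.
    if n <= 1:
        return 1
    # Recursive call on a strictly smaller input, so it moves towards the base case.
    return n * factorial(n - 1)
```

For example, `factorial(5)` unwinds as 5 * 4 * 3 * 2 * 1 and returns 120.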
Binary Trees
A binary tree is a tree data structure where every node has at most two children.
Each node falls into one of three categories: the root, which has no parent; a leaf, which has no children; and an inner node, which has at least one child. A node can be defined as follows:
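The post's own definition is not reproduced in this excerpt. One common way to write a node, offered here only as a sketch with hypothetical names (`TreeNode`, `val`, `is_leaf`), is:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    """A binary tree node holding a value and up to two children."""
    val: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def is_leaf(node):
    """A leaf is a node with no children."""
    return node.left is None and node.right is None
```

With this definition, `TreeNode(1, TreeNode(2), TreeNode(3))` builds a root with two leaf children.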
SQL Patterns
NOTE: When solving SQL questions, always make sure that all the required components listed below are taken care of:
ML Design
An ML design task can broadly be broken down into the following components, in order of priority:
- Defining success metrics
- Data pipelines
- Backend model design
- Online model design
- Model evaluation
- Model monitoring
- Serving
Defining success metrics
IMDB movie reviews
The goal of this notebook is to understand some details of Hugging Face’s datasets and transformers libraries, and to serve as a reference point for fine-tuning an LLM for a classification task.
Arrays
One of the most basic things one can do with an array is to have a pointer, or a set of pointers, traverse the array in some fashion. The number of pointers to keep track of depends on the question at hand. Let’s look at a few sample problems to understand this concept:
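As one possible sample problem (an assumption for illustration, not necessarily one from the post), two pointers starting at opposite ends of a sorted array can find a pair with a given sum in a single pass. The name `pair_with_sum` is hypothetical.

```python
def pair_with_sum(sorted_nums, target):
    """Return indices (lo, hi) with sorted_nums[lo] + sorted_nums[hi] == target, or None."""
    lo, hi = 0, len(sorted_nums) - 1
    while lo < hi:
        current = sorted_nums[lo] + sorted_nums[hi]
        if current == target:
            return lo, hi
        if current < target:
            lo += 1  # sum too small: move the left pointer to a larger value
        else:
            hi -= 1  # sum too large: move the right pointer to a smaller value
    return None
```

Because the array is sorted, each step can safely discard one end, so only two pointers need to be tracked.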
2019
Combinatorics, Tail sum for expectation and Simplices
1. How many different $k$-tuples of non-negative integers sum to a given value $n$?
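The standard stars-and-bars argument gives $\binom{n+k-1}{k-1}$ for this count. The sketch below compares that closed form with a brute-force enumeration for small inputs; the function names are hypothetical, and the post may reach the result by a different route.

```python
from itertools import product
from math import comb


def count_tuples(n, k):
    """Closed form: number of k-tuples of non-negative integers summing to n (k >= 1)."""
    return comb(n + k - 1, k - 1)


def count_tuples_brute(n, k):
    """Enumerate every k-tuple with entries in 0..n and count those summing to n."""
    return sum(1 for t in product(range(n + 1), repeat=k) if sum(t) == n)
```

For $n = 5$ and $k = 3$ both functions return 21.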
Absorbing Markov Chains
A Markov chain containing absorbing states is known as an absorbing Markov chain. So what is an absorbing state? In simple words, if you end up in an absorbing state you can’t go anywhere else; you are stuck there for all eternity. In other words, the probability of transition from an absorbing state $i$ to any other state is 0. The non-absorbing states are called transient states.
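In terms of the transition matrix $P$, state $i$ is absorbing exactly when $P_{ii} = 1$. A minimal sketch of that check follows; the function name `absorbing_states` and the four-state random-walk matrix are assumptions made for illustration.

```python
def absorbing_states(P, tol=1e-12):
    """Return the indices i of a row-stochastic matrix P for which P[i][i] == 1."""
    return [i for i, row in enumerate(P) if abs(row[i] - 1.0) <= tol]


# Random walk on states 0..3 where both end states are absorbing.
P = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]
```

Here `absorbing_states(P)` picks out states 0 and 3, while states 1 and 2 are transient.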
Logistic Regression
One of the most common use cases in supervised machine learning is classification: given a data point, classify it into one of the available labels.
Poisson Process
Let’s imagine rain falling. One obvious parameter describing this process is the rate: whether it’s drizzling or pouring! Let’s now focus on a tiny patch of land and assume that the rate is constant; we will call it $\lambda$. We can describe rain as a Poisson process.
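One way to get a feel for such a process is to simulate it. For a Poisson process with constant rate $\lambda$, the gaps between successive arrivals are independent exponential variables with mean $1/\lambda$, which the sketch below uses directly. The function name `simulate_arrivals` and its parameters are hypothetical.

```python
import random


def simulate_arrivals(rate, duration, rng=random):
    """Return sorted arrival times in [0, duration) of a Poisson process with the given rate."""
    times = []
    # Gaps between arrivals are exponential with mean 1 / rate.
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times
```

Averaged over many runs, the number of arrivals should be close to `rate * duration`.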
PDF of a dependent variable
"The Calculus required continuity, and continuity was supposed to require the infinitely little; but nobody could discover what the infinitely little might be." -- Bertrand Russell, in Mysticism and Logic and Other Essays, The Floating Press, 1 August 2010, p. 100