Calculus for Data Science and Machine Learning

What is a Function?

A function is a relationship between two sets that associates each element of the first set with exactly one element of the second set. The first set is called the domain, and the second set is called the codomain; the set of outputs the function actually produces is called the range. Functions are often used to describe mathematical relationships where one quantity depends on another.

In mathematical notation, a function is typically written as f(x), where f denotes the function and x is an element from the domain. The expression f(x) represents the value of the function at the element x.

Example

Consider the function:

f(x) = x^2

This function takes an input x and squares it to produce the output.

Let's evaluate this function for a few values of x:

- If x = 1, then f(1) = 1^2 = 1
- If x = 2, then f(2) = 2^2 = 4
- If x = 3, then f(3) = 3^2 = 9

In this example, the domain of the function is all real numbers, and the range is all non-negative real numbers (since squaring any real number results in a non-negative number).
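As a minimal sketch, the same function can be written in Python. The name `f` mirrors the notation above, and the extra input -3 is an assumption added only to illustrate the non-negative range.

```python
def f(x):
    """Return the square of x, mirroring f(x) = x^2."""
    return x ** 2

# Evaluate f at the sample inputs used in the text.
values = {x: f(x) for x in (1, 2, 3)}
print(values)

# Squaring a negative input also gives a non-negative output.
print(f(-3))
```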

What is a Limit?

In calculus, a limit is a fundamental concept that describes the value that a function approaches as the input (or variable) approaches a certain value. Limits help us understand the behavior of a function near a specific point, even if the function is not defined at that point.

Formal Definition

The limit of a function f(x) as x approaches a value c is denoted by:

lim (x -> c) f(x)

This means that as x gets arbitrarily close to c (from either side), f(x) approaches a specific value, which we call L. Formally, we write:

lim (x -> c) f(x) = L

Example

Consider the function f(x) = (x^2 - 1) / (x - 1). We want to find the limit as x approaches 1.

Evaluating the Limit

Directly substituting x = 1 into the function f(x) = (x^2 - 1) / (x - 1) results in an indeterminate form (0/0). However, we can simplify the function first:

f(x) = (x^2 - 1) / (x - 1) = [(x - 1)(x + 1)] / (x - 1)

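For x ≠ 1, the factor (x - 1) cancels out, leaving us with:

f(x) = x + 1 (for x ≠ 1)

The limit depends only on the values of f(x) near x = 1, not on the value at x = 1 itself, where f is undefined. So we can evaluate the limit of the simplified expression:

lim (x -> 1) (x + 1) = 1 + 1 = 2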

Thus, we find that:

lim (x -> 1) f(x) = 2

In this example, as x approaches 1, the value of f(x) approaches 2. Therefore, the limit of f(x) as x approaches 1 is 2.
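The limit can also be explored numerically. The sketch below evaluates the original, unsimplified expression at points close to 1 from both sides; the offsets 0.1, 0.01, and 0.001 are arbitrary choices for illustration. The values should settle near 2, while direct substitution of x = 1 fails.

```python
def f(x):
    """(x^2 - 1) / (x - 1); undefined at x = 1."""
    return (x ** 2 - 1) / (x - 1)

# Approach x = 1 from the left and from the right with shrinking offsets.
for offset in (0.1, 0.01, 0.001):
    left = f(1 - offset)
    right = f(1 + offset)
    print(f"offset={offset}: f(1 - offset)={left:.4f}, f(1 + offset)={right:.4f}")

# Substituting x = 1 directly fails, matching the 0/0 form in the text.
try:
    f(1)
except ZeroDivisionError:
    print("f(1) is undefined")
```

This is numerical evidence, not a proof; the algebraic simplification above is what establishes the limit.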

What is Continuity?

In calculus, a function is said to be continuous if there are no breaks, jumps, or holes in its graph. Continuity ensures that small changes in the input (x) result in small changes in the output (f(x)). Formally, a function f(x) is continuous at a point c if the following three conditions are met:

- f(c) is defined.
- The limit of f(x) as x approaches c exists.
- The limit of f(x) as x approaches c is equal to f(c).

Formal Definition

A function f(x) is continuous at a point c if:

lim (x -> c) f(x) = f(c)

Example

Consider the function f(x) = x^2. We will check if this function is continuous at x = 2.

Checking Continuity

Is f(2) defined?

   f(2) = 2^2 = 4

Does the limit of f(x) as x approaches 2 exist?

   lim (x -> 2) f(x) = lim (x -> 2) x^2 = 2^2 = 4

Is the limit of f(x) as x approaches 2 equal to f(2)?

   lim (x -> 2) f(x) = 4 and f(2) = 4

Since all three conditions are satisfied, the function f(x) = x^2 is continuous at x = 2.

In general, polynomial functions like f(x) = x^2 are continuous for all real numbers, meaning there are no breaks, jumps, or holes in their graphs across their entire domain.
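The idea that small changes in the input produce small changes in the output can be illustrated numerically. The following sketch, which uses arbitrarily chosen offsets, measures how far f(x) = x^2 moves from f(2) when x moves slightly away from 2. It suggests continuity at x = 2 but does not prove it.

```python
def f(x):
    return x ** 2

c = 2
# Largest gap between f(c) and f at distance h on either side of c.
gaps = [
    max(abs(f(c + h) - f(c)), abs(f(c - h) - f(c)))
    for h in (0.1, 0.01, 0.001)
]
print(gaps)  # the gaps shrink as h shrinks
```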

What is Differentiability?

In calculus, a function is said to be differentiable at a point if its derivative exists at that point; geometrically, the graph has a well-defined, non-vertical tangent line there. Differentiability implies that the function is smooth at the given point, with no sharp corners or cusps. If a function is differentiable at a point, it is also continuous at that point, but the converse is not necessarily true: f(x) = |x| is continuous at x = 0 but not differentiable there, because its graph has a sharp corner.

Formal Definition

A function f(x) is differentiable at a point c if the following limit exists:

lim (h -> 0) [f(c + h) - f(c)] / h

This limit, if it exists, is the derivative of f(x) at the point c, denoted as f'(c).

Example

Consider the function f(x) = x^2. We will check if this function is differentiable at x = 2.

Checking Differentiability

To find the derivative of f(x) = x^2 at x = 2, we use the definition of the derivative:

f'(x) = lim (h -> 0) [f(x + h) - f(x)] / h

f'(2) = lim (h -> 0) [(2 + h)^2 - 2^2] / h

= lim (h -> 0) [4 + 4h + h^2 - 4] / h

= lim (h -> 0) [4h + h^2] / h

= lim (h -> 0) [4 + h]

= 4

Since the limit exists and equals 4, the function f(x) = x^2 is differentiable at x = 2, and the derivative at that point is f'(2) = 4.

In general, polynomial functions like f(x) = x^2 are differentiable for all real numbers, meaning they have well-defined tangents at every point in their domain.
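The limit in the definition can be approximated by evaluating the difference quotient for small values of h. In this sketch the helper name `difference_quotient` and the chosen values of h are assumptions for illustration; for f(x) = x^2 at c = 2 the quotient equals 4 + h, so it should approach 4 from both sides.

```python
def f(x):
    return x ** 2

def difference_quotient(func, c, h):
    """[func(c + h) - func(c)] / h, the expression inside the limit."""
    return (func(c + h) - func(c)) / h

# Positive h approaches from the right, negative h from the left.
for h in (0.1, 0.01, 0.001, -0.001):
    print(h, difference_quotient(f, 2, h))
```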

What is a Derivative?

In calculus, the derivative of a function represents the rate at which the function is changing at any given point. Geometrically, it corresponds to the slope of the tangent line to the graph of the function at that point. The derivative provides important information about the behavior of a function, including its increasing or decreasing nature, concavity, and local extrema.

Formal Definition

The derivative of a function f(x) with respect to x is denoted by f'(x) or dy/dx and is defined as the limit of the average rate of change of the function as the interval approaches zero:

f'(x) = lim (h -> 0) [f(x + h) - f(x)] / h

Alternatively, if y = f(x), then the derivative dy/dx is given by:

dy/dx = lim (h -> 0) [f(x + h) - f(x)] / h

Example

Consider the function f(x) = x^2. We will find its derivative with respect to x.

Finding the Derivative

Using the definition of the derivative:

f'(x) = lim (h -> 0) [f(x + h) - f(x)] / h

= lim (h -> 0) [(x + h)^2 - x^2] / h

= lim (h -> 0) [x^2 + 2xh + h^2 - x^2] / h

= lim (h -> 0) [2xh + h^2] / h

= lim (h -> 0) [2x + h]

= 2x

So, the derivative of f(x) = x^2 with respect to x is f'(x) = 2x.

In this example, the derivative function f'(x) = 2x gives the slope of the tangent line to the graph of f(x) = x^2 at any point x.
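One way to sanity-check the result f'(x) = 2x is to compare it with a numerical estimate of the slope at several points. The sketch below uses a central difference, which is a different approximation from the one-sided quotient in the definition; the helper name, step size, and sample points are assumptions.

```python
def f(x):
    return x ** 2

def f_prime(x):
    """Derivative found above: f'(x) = 2x."""
    return 2 * x

def central_difference(func, x, h=1e-5):
    """Symmetric numerical approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-2.0, 0.0, 1.5, 3.0):
    print(x, f_prime(x), central_difference(f, x))
```

The two columns should agree up to floating-point rounding.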

Standard Derivative Formulas

Here are some standard derivative formulas for common functions:

Power Rule:

If f(x) = x^n, where n is a constant, then
f'(x) = nx^(n-1)

Exponential Rule:

If f(x) = e^x, then
f'(x) = e^x

Logarithmic Rule:

If f(x) = log_b(x), where the base b is positive and b ≠ 1, then
f'(x) = 1 / (x * ln(b)) for x > 0, where ln denotes the natural logarithm

Sine Rule:

If f(x) = sin(x), then
f'(x) = cos(x)

Cosine Rule:

If f(x) = cos(x), then
f'(x) = -sin(x)

Tangent Rule:

If f(x) = tan(x), then
f'(x) = sec^2(x)

These are some of the basic derivative formulas used in calculus. They allow us to find the derivatives of various functions efficiently.
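The formulas above can be spot-checked against numerical slopes using Python's standard `math` module. This is a rough sketch: the sample point x = 0.7, the exponent n = 3, and the base b = 10 are arbitrary choices, and the central-difference helper is an assumption rather than part of the formulas.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric numerical approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 0.7  # arbitrary sample point inside the domain of every function below

# Each entry pairs a function with the value its derivative formula predicts at x.
checks = {
    "power (n = 3)": (lambda t: t ** 3, 3 * x ** 2),
    "exponential": (math.exp, math.exp(x)),
    "logarithm (b = 10)": (math.log10, 1 / (x * math.log(10))),
    "sine": (math.sin, math.cos(x)),
    "cosine": (math.cos, -math.sin(x)),
    "tangent": (math.tan, 1 / math.cos(x) ** 2),  # sec^2(x) = 1 / cos^2(x)
}

for name, (func, expected) in checks.items():
    approx = central_difference(func, x)
    print(f"{name}: formula={expected:.6f}, numeric={approx:.6f}")
```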

Rules for Finding Derivatives

In calculus, there are several rules that allow us to find the derivatives of more complex functions by combining the derivatives of simpler functions. Here are some of the most commonly used rules:

Constant Rule:

If f(x) = c, where c is a constant, then
f'(x) = 0

Sum Rule:

If f(x) = g(x) + h(x), then
f'(x) = g'(x) + h'(x)

Difference Rule:

If f(x) = g(x) - h(x), then
f'(x) = g'(x) - h'(x)

Product Rule:

If f(x) = g(x) * h(x), then
f'(x) = g'(x) * h(x) + g(x) * h'(x)

Quotient Rule:

If f(x) = g(x) / h(x) with h(x) ≠ 0, then
f'(x) = (g'(x) * h(x) - g(x) * h'(x)) / [h(x)]^2

Chain Rule:

If f(x) = g(h(x)), then
f'(x) = g'(h(x)) * h'(x)

These rules provide a systematic way to find the derivatives of various functions by breaking them down into simpler components and applying basic differentiation techniques.
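To see the product, quotient, and chain rules at work, the sketch below applies each rule to two hypothetical component functions, g(x) = sin(x) and h(x) = x^2 + 1, and compares the result with a numerical slope of the combined function. The component functions, the sample point, and the helper names are all assumptions chosen for illustration.

```python
import math

def central_difference(func, x, step=1e-6):
    """Symmetric numerical approximation of the derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Hypothetical component functions and their known derivatives.
def g(x):
    return math.sin(x)

def g_prime(x):
    return math.cos(x)

def h(x):
    return x ** 2 + 1

def h_prime(x):
    return 2 * x

x0 = 0.5  # arbitrary sample point

# Each entry pairs a combined function with the derivative its rule predicts at x0.
rules = {
    "product": (lambda t: g(t) * h(t),
                g_prime(x0) * h(x0) + g(x0) * h_prime(x0)),
    "quotient": (lambda t: g(t) / h(t),
                 (g_prime(x0) * h(x0) - g(x0) * h_prime(x0)) / h(x0) ** 2),
    "chain": (lambda t: g(h(t)),
              g_prime(h(x0)) * h_prime(x0)),
}

for name, (combined, by_rule) in rules.items():
    numeric = central_difference(combined, x0)
    print(f"{name}: rule={by_rule:.6f}, numeric={numeric:.6f}")
```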

Higher Order Derivatives

In calculus, higher order derivatives refer to the derivatives of derivatives. For example, the second derivative of a function is the derivative of its first derivative, and so on.

Notation:

The first derivative of a function f(x) is denoted as f'(x) or dy/dx.
The second derivative of f(x) is denoted as f''(x) or d^2y/dx^2.
Higher order derivatives can be denoted as f^(n)(x) or d^n y/dx^n, where n is the order of the derivative.

Example:

Let's consider a function f(x) = x^3. We'll find its first, second, and third derivatives.

First Derivative:

f(x) = x^3

f'(x) = 3x^2

Second Derivative:

f'(x) = 3x^2

f''(x) = 6x

Third Derivative:

f''(x) = 6x

f'''(x) = 6

In this example, each differentiation lowers the degree of the polynomial by one. The first derivative measures the rate of change of the function, the second derivative measures the rate of change of the rate of change, and so on. Higher order derivatives provide increasingly detailed information about the behavior of the function.
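For polynomials, repeated differentiation can be sketched by applying the power rule to a list of coefficients. The representation below, a list `[a0, a1, a2, ...]` standing for a0 + a1*x + a2*x^2 + ..., and the helper name `differentiate` are assumptions made for this illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...].

    The term a_k * x^k becomes k * a_k * x^(k-1), so each coefficient is
    multiplied by its index and the constant term is dropped.
    """
    return [k * a for k, a in enumerate(coeffs)][1:]

f = [0, 0, 0, 1]          # x^3
f1 = differentiate(f)     # coefficients of 3x^2
f2 = differentiate(f1)    # coefficients of 6x
f3 = differentiate(f2)    # coefficients of the constant 6

print(f1, f2, f3)
```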

Maxima and Minima

In calculus, maxima and minima refer to the highest and lowest points, respectively, on a graph of a function. These points represent local extrema, meaning that they are either the highest or lowest points in a small neighborhood around them.

Definition:

Maxima: A point x = c is a local maximum of a function f(x) if there exists an open interval (a, b) containing c such that f(x) ≤ f(c) for all x in (a, b). In other words, the function attains its highest value at x = c within a small interval around c.
Minima: A point x = c is a local minimum of a function f(x) if there exists an open interval (a, b) containing c such that f(x) ≥ f(c) for all x in (a, b). In other words, the function attains its lowest value at x = c within a small interval around c.

Example:

Consider the function f(x) = x^2 - 4x + 3. We'll find its critical points and determine whether they correspond to maxima or minima.

Critical Points:

To find the critical points, we set the derivative equal to zero and solve for x:

f(x) = x^2 - 4x + 3

f'(x) = 2x - 4

Set f'(x) = 0:

2x - 4 = 0
2x = 4
x = 2

So, x = 2 is a critical point of the function.

Second Derivative Test:

To determine whether the critical point corresponds to a maximum or minimum, we use the second derivative test. If the second derivative is positive at the critical point, it's a local minimum. If it's negative, it's a local maximum. If it's zero, the test is inconclusive.

f''(x) = 2

At x = 2:
f''(2) = 2 > 0

Since the second derivative is positive at x = 2, the function has a local minimum at this point.

In this example, we found that the function f(x) = x^2 - 4x + 3 has a local minimum at x = 2, where its value is f(2) = 4 - 8 + 3 = -1.
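The procedure above can be sketched in a few lines of Python. The derivatives are hard-coded from the worked example, and the helper name `classify` is an assumption; it also returns "inconclusive" when the second derivative is zero, a case in which the second derivative test gives no answer.

```python
def f(x):
    return x ** 2 - 4 * x + 3

def f_prime(x):
    return 2 * x - 4

def f_double_prime(x):
    return 2

def classify(second_derivative_value):
    """Apply the second derivative test to a critical point."""
    if second_derivative_value > 0:
        return "local minimum"
    if second_derivative_value < 0:
        return "local maximum"
    return "inconclusive"

# Solve 2x - 4 = 0 for the critical point.
critical_point = 4 / 2

print(critical_point, f(critical_point), classify(f_double_prime(critical_point)))
```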

What is a Multivariable Function?

A multivariable function is a function whose input consists of two or more variables. It associates each element of its domain, which consists of ordered pairs (or tuples) of real numbers, with exactly one output value. Multivariable functions describe mathematical relationships where one quantity depends on two or more other quantities.

In mathematical notation, a multivariable function is typically written as f(x, y, z, ...), where f denotes the function, and x, y, z, etc., are elements from the domain. The expression f(x, y, z, ...) represents the value of the function at the elements x, y, z, etc.

Example

Consider the function:

f(x, y) = x^2 + y^2

This function takes two inputs, x and y, and returns the sum of their squares as the output.

Let's evaluate this function for a few values of x and y:

- If x = 1 and y = 1, then f(1, 1) = 1^2 + 1^2 = 1 + 1 = 2
- If x = 2 and y = 3, then f(2, 3) = 2^2 + 3^2 = 4 + 9 = 13
- If x = 0 and y = -2, then f(0, -2) = 0^2 + (-2)^2 = 0 + 4 = 4

In this example, the domain of the function is all pairs of real numbers (x, y), and the range is all non-negative real numbers (since the sum of two squares is never negative).

Visual Representation

Multivariable functions can be visualized using graphs in higher dimensions. For example, the function f(x, y) = x^2 + y^2 can be visualized as a surface in three-dimensional space, where the height of the surface at each point (x, y) corresponds to the value of the function at that point.
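Without a plotting library, the surface can still be sketched as a grid of heights. In the following illustration the grid points from -2 to 2 are an arbitrary choice; each printed number is the height of the surface above one point (x, y), and the lowest value sits at the center, above (0, 0).

```python
def f(x, y):
    return x ** 2 + y ** 2

xs = [-2, -1, 0, 1, 2]
ys = [-2, -1, 0, 1, 2]

# heights[i][j] is the surface height above the point (xs[j], ys[i]).
heights = [[f(x, y) for x in xs] for y in ys]

for row in heights:
    print(" ".join(f"{z:2d}" for z in row))
```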

Derivatives of Multivariable Functions

In multivariable calculus, the concept of derivatives extends to functions of more than one variable. The derivatives of multivariable functions measure how the function changes as each of its input variables changes. These derivatives are essential for understanding the behavior and properties of multivariable functions.

Partial Derivatives

The partial derivative of a multivariable function with respect to one of its variables measures the rate at which the function changes as that variable changes, while keeping the other variables constant.

Notation

For a function f(x, y), the partial derivative of f with respect to x is denoted by:

∂f/∂x

And the partial derivative of f with respect to y is denoted by:

∂f/∂y

Example

Consider the function:

f(x, y) = x^2 + y^3

To find the partial derivatives, we differentiate f with respect to each variable separately.

Partial Derivative with respect to x

∂f/∂x = ∂/∂x (x^2 + y^3)
       = ∂/∂x (x^2) + ∂/∂x (y^3)
       = 2x + 0
       = 2x

Partial Derivative with respect to y

∂f/∂y = ∂/∂y (x^2 + y^3)
       = ∂/∂y (x^2) + ∂/∂y (y^3)
       = 0 + 3y^2
       = 3y^2

So, the partial derivatives of f(x, y) are:

∂f/∂x = 2x
∂f/∂y = 3y^2
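Holding one variable constant while nudging the other is easy to mimic numerically. The sketch below estimates both partial derivatives of f(x, y) = x^2 + y^3 with central differences and compares them with 2x and 3y^2; the helper names, step size, and sample point (1.5, 2.0) are assumptions for illustration.

```python
def f(x, y):
    return x ** 2 + y ** 3

def partial_x(func, x, y, h=1e-6):
    """Vary x only, keeping y fixed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    """Vary y only, keeping x fixed."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 1.5, 2.0
print(partial_x(f, x0, y0), 2 * x0)        # numeric estimate vs. 2x
print(partial_y(f, x0, y0), 3 * y0 ** 2)   # numeric estimate vs. 3y^2
```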

Gradient

The gradient of a multivariable function is a vector that consists of all its partial derivatives. It points in the direction of the steepest ascent of the function.

Notation

For a function f(x, y), the gradient is denoted by ∇f and is given by:

∇f = (∂f/∂x, ∂f/∂y)

Example

Using the previous function f(x, y) = x^2 + y^3, the gradient is:

∇f = (2x, 3y^2)
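As a small sketch, the gradient can be written as a function that returns both partial derivatives as a tuple. The sample point (1, 2) and the step size 0.001 are arbitrary; the last lines illustrate that taking a small step along the gradient increases the value of f.

```python
def f(x, y):
    return x ** 2 + y ** 3

def gradient(x, y):
    """Gradient of f(x, y) = x^2 + y^3: (df/dx, df/dy) = (2x, 3y^2)."""
    return (2 * x, 3 * y ** 2)

print(gradient(1, 2))
print(gradient(0, -1))

# A small step along the gradient direction increases f.
x, y = 1.0, 2.0
gx, gy = gradient(x, y)
step = 0.001
print(f(x, y), f(x + step * gx, y + step * gy))
```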

Gradient Descent

Gradient descent is an optimization algorithm used to minimize functions by iteratively moving towards the steepest descent direction as defined by the negative of the gradient. It is widely used in machine learning and deep learning to minimize the cost function and improve model performance.

The Concept

In the context of a multivariable function, the gradient descent algorithm seeks to find the minimum value of the function by following the direction of the negative gradient. The gradient at any point indicates the direction of the steepest ascent, so moving in the opposite direction (the negative gradient) leads to the steepest descent.

Basic Steps

1. Initialize: Start with an initial guess for the function's variables.
2. Compute Gradient: Calculate the gradient of the function at the current point.
3. Update Variables: Adjust the variables in the direction of the negative gradient.
4. Iterate: Repeat steps 2 and 3 until convergence (i.e., when the changes in the function value become negligible).

Mathematical Formulation

For a function f(x, y), the update rule for gradient descent can be written as:

x_new = x_old - α * (∂f/∂x)
y_new = y_old - α * (∂f/∂y)

Here, α (alpha) is the learning rate, a positive scalar that determines the step size.

Example

Consider the function:

f(x, y) = x^2 + y^2

This function has a global minimum at (x, y) = (0, 0). We will use gradient descent to find this minimum.

Step-by-Step Process

Initialize: Start with an initial guess, say (x_0, y_0) = (3, 4).
Compute Gradient: Calculate the partial derivatives of f(x, y).

∂f/∂x = 2x
∂f/∂y = 2y

At (3, 4):

∂f/∂x = 2 * 3 = 6
∂f/∂y = 2 * 4 = 8

Update Variables: Choose a learning rate, say α = 0.1.

x_new = 3 - 0.1 * 6 = 3 - 0.6 = 2.4
y_new = 4 - 0.1 * 8 = 4 - 0.8 = 3.2

Iterate: Repeat the process with the new values (2.4, 3.2).

Calculate the gradient at (2.4, 3.2):

∂f/∂x = 2 * 2.4 = 4.8
∂f/∂y = 2 * 3.2 = 6.4

Update the variables:

x_new = 2.4 - 0.1 * 4.8 = 2.4 - 0.48 = 1.92
y_new = 3.2 - 0.1 * 6.4 = 3.2 - 0.64 = 2.56

Repeat these steps until the values of x and y converge to (0, 0).
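The iteration above can be sketched as a short loop. This version assumes the same starting point (3, 4) and learning rate α = 0.1 as the worked example, and, as a simplification, runs for a fixed number of steps instead of testing for convergence; the function names and the step count of 50 are hypothetical.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return 2 * x, 2 * y

def gradient_descent(x, y, alpha=0.1, steps=50):
    """Run a fixed number of gradient descent updates and record each point."""
    history = [(x, y)]
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        # Update rule: move against the gradient, scaled by the learning rate.
        x, y = x - alpha * gx, y - alpha * gy
        history.append((x, y))
    return history

history = gradient_descent(3.0, 4.0)
print(history[1])    # first update, close to (2.4, 3.2)
print(history[2])    # second update, close to (1.92, 2.56)
print(history[-1])   # after 50 updates, close to (0, 0)
```

For this function each update multiplies both coordinates by 1 - 2α = 0.8, which is why the points shrink steadily toward the minimum.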

Convergence

The convergence of gradient descent depends on the choice of the learning rate:

Too large: The algorithm may overshoot the minimum and fail to converge.
Too small: The algorithm may converge very slowly.
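The effect of the learning rate can be seen by running the same update on f(x, y) = x^2 + y^2 with different values of α. In this sketch the three rates (0.001, 0.1, 1.1), the step count, and the helper name `run` are illustrative assumptions. For this particular function each update multiplies x and y by 1 - 2α, so the iterates shrink toward (0, 0) only when 0 < α < 1.

```python
def run(alpha, steps=20, start=(3.0, 4.0)):
    """Apply gradient descent to f(x, y) = x^2 + y^2 and return the final point."""
    x, y = start
    for _ in range(steps):
        x, y = x - alpha * 2 * x, y - alpha * 2 * y
    return x, y

def distance_from_minimum(point):
    x, y = point
    return (x ** 2 + y ** 2) ** 0.5

for alpha in (0.001, 0.1, 1.1):
    print(alpha, distance_from_minimum(run(alpha)))
```

Starting at distance 5 from the minimum, the smallest rate barely moves in 20 steps, the middle rate gets close to (0, 0), and the largest rate overshoots further on every step and diverges.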

Harsh Mishra
