Calculus for Data Science and Machine Learning

harshm03

Harsh Mishra

Posted on May 31, 2024

What is a Function?

A function is a relationship between two sets that associates each element of the first set with exactly one element of the second set. The first set is called the domain, and the second set is called the codomain; the range is the subset of the codomain that the function actually outputs. Functions are often used to describe mathematical relationships where one quantity depends on another.

In mathematical notation, a function is typically written as f(x), where f denotes the function and x is an element from the domain. The expression f(x) represents the value of the function at the element x.

Example

Consider the function:

f(x) = x^2

This function takes an input x and squares it to produce the output.

Let's evaluate this function for a few values of x:

- If x = 1, then f(1) = 1^2 = 1
- If x = 2, then f(2) = 2^2 = 4
- If x = 3, then f(3) = 3^2 = 9

In this example, the domain of the function is all real numbers, and the range is all non-negative real numbers (since squaring any real number results in a non-negative number).
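
The evaluations above can be reproduced with a few lines of Python. This is only a minimal sketch; the names `f` and `values` are chosen for illustration.

```python
def f(x):
    """Square function f(x) = x^2 from the example above."""
    return x ** 2

# Evaluate f at the same inputs used in the text.
values = {x: f(x) for x in (1, 2, 3)}
print(values)  # {1: 1, 2: 4, 3: 9}

# A negative input still gives a non-negative output, matching the stated range.
print(f(-3))  # 9
```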

What is a Limit?

In calculus, a limit is a fundamental concept that describes the value that a function approaches as the input (or variable) approaches a certain value. Limits help us understand the behavior of functions as they get close to specific points, even if they do not actually reach those points.

Formal Definition

The limit of a function f(x) as x approaches a value c is denoted by:

lim (x -> c) f(x)

This means that as x gets arbitrarily close to c (from either side), f(x) approaches a specific value, which we call L. Formally, we write:

lim (x -> c) f(x) = L

Example

Consider the function f(x) = (x^2 - 1) / (x - 1). We want to find the limit as x approaches 1.

Evaluating the Limit

Directly substituting x = 1 into the function f(x) = (x^2 - 1) / (x - 1) results in an indeterminate form (0/0). However, we can simplify the function first:

f(x) = (x^2 - 1) / (x - 1) = [(x - 1)(x + 1)] / (x - 1)

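For x ≠ 1, the factor (x - 1) cancels out, leaving us with:

f(x) = x + 1   (for x ≠ 1)

The limit depends only on values of x near 1, not on x = 1 itself (where f is undefined), so we can evaluate the simplified expression at x = 1:

1 + 1 = 2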

Thus, we find that:

lim (x -> 1) f(x) = 2

In this example, as x approaches 1, the value of f(x) approaches 2. Therefore, the limit of f(x) as x approaches 1 is 2.
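
A numerical check can make this limit more concrete. The sketch below evaluates f at points closer and closer to 1 from both sides; the offsets and the names `left` and `right` are illustrative choices, not part of the original example.

```python
def f(x):
    """f(x) = (x^2 - 1) / (x - 1), which is undefined at x = 1."""
    return (x ** 2 - 1) / (x - 1)

# Approach x = 1 from the left and from the right with shrinking offsets.
offsets = [0.1, 0.01, 0.001, 0.0001]
left = [f(1 - d) for d in offsets]
right = [f(1 + d) for d in offsets]

for d, lo, hi in zip(offsets, left, right):
    print(f"d={d}: f(1-d)={lo:.6f}, f(1+d)={hi:.6f}")

# Substituting x = 1 directly divides by zero, so f(1) itself does not exist.
try:
    f(1)
except ZeroDivisionError:
    print("f(1) is undefined")
```

Both columns move toward 2 as the offset shrinks, even though the function has no value at x = 1.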

What is Continuity?

In calculus, a function is said to be continuous if there are no breaks, jumps, or holes in its graph. Continuity ensures that small changes in the input (x) result in small changes in the output (f(x)). Formally, a function f(x) is continuous at a point c if the following three conditions are met:

  1. f(c) is defined.
  2. The limit of f(x) as x approaches c exists.
  3. The limit of f(x) as x approaches c is equal to f(c).

Formal Definition

A function f(x) is continuous at a point c if:

lim (x -> c) f(x) = f(c)

Example

Consider the function f(x) = x^2. We will check if this function is continuous at x = 2.

Checking Continuity

  1. Is f(2) defined?
   f(2) = 2^2 = 4
  2. Does the limit of f(x) as x approaches 2 exist?
   lim (x -> 2) f(x) = lim (x -> 2) x^2 = 2^2 = 4
  3. Is the limit of f(x) as x approaches 2 equal to f(2)?
   lim (x -> 2) f(x) = 4 and f(2) = 4

Since all three conditions are satisfied, the function f(x) = x^2 is continuous at x = 2.

In general, polynomial functions like f(x) = x^2 are continuous for all real numbers, meaning there are no breaks, jumps, or holes in their graphs across their entire domain.
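
The three conditions can be roughly imitated in code. The sketch below is a heuristic, not a proof: the helper `is_continuous_at`, its tolerances, and the two contrasting functions `step` (a jump) and `hole` (undefined at one point) are assumptions added for illustration. A numerical check like this can be fooled by functions that change very quickly.

```python
def is_continuous_at(f, c, eps=1e-6, tol=1e-4):
    """Heuristic check of the three continuity conditions at x = c."""
    try:
        fc = f(c)  # condition 1: f(c) must be defined
    except (ZeroDivisionError, ValueError):
        return False
    # Conditions 2 and 3 (approximately): values just left and right of c
    # should both be close to f(c).
    left, right = f(c - eps), f(c + eps)
    return abs(left - fc) < tol and abs(right - fc) < tol

def square(x):
    return x ** 2

def step(x):
    # Jumps from 0 to 1 at x = 0.
    return 0.0 if x < 0 else 1.0

def hole(x):
    # Undefined at x = 1.
    return (x ** 2 - 1) / (x - 1)

print(is_continuous_at(square, 2))  # True
print(is_continuous_at(step, 0))    # False
print(is_continuous_at(hole, 1))    # False
```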

What is Differentiability?

In calculus, a function is said to be differentiable at a point if its derivative exists there, which geometrically means the graph has a well-defined, non-vertical tangent line at that point. Differentiability implies that the function is smooth and has no sharp corners or cusps at the given point. If a function is differentiable at a point, it is also continuous at that point, but the converse is not necessarily true: f(x) = |x| is continuous at x = 0 but not differentiable there.

Formal Definition

A function f(x) is differentiable at a point c if the following limit exists:

lim (h -> 0) [f(c + h) - f(c)] / h

This limit, if it exists, is the derivative of f(x) at the point c, denoted as f'(c).

Example

Consider the function f(x) = x^2. We will check if this function is differentiable at x = 2.

Checking Differentiability

To find the derivative of f(x) = x^2 at x = 2, we use the definition of the derivative:

f'(x) = lim (h -> 0) [f(x + h) - f(x)] / h

f'(2) = lim (h -> 0) [(2 + h)^2 - 2^2] / h

= lim (h -> 0) [4 + 4h + h^2 - 4] / h

= lim (h -> 0) [4h + h^2] / h

= lim (h -> 0) [4 + h]

= 4

Since the limit exists and equals 4, the function f(x) = x^2 is differentiable at x = 2, and the derivative at that point is f'(2) = 4.

In general, polynomial functions like f(x) = x^2 are differentiable for all real numbers, meaning they have well-defined tangents at every point in their domain.
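
The limit in the definition can be explored numerically by computing the difference quotient for small values of h. In the sketch below, `difference_quotient` is a hypothetical helper, and the absolute value function is a standard example of a sharp corner added here for contrast; it is not part of the worked example.

```python
def difference_quotient(f, c, h):
    """[f(c + h) - f(c)] / h, the expression inside the limit."""
    return (f(c + h) - f(c)) / h

def square(x):
    return x ** 2

# For f(x) = x^2 at c = 2 the quotient approaches 4 from both sides.
for h in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(f"h={h}: {difference_quotient(square, 2, h):.4f}")

# For |x| at c = 0 the one-sided quotients disagree, so the limit
# does not exist and |x| is not differentiable at 0.
right_slope = difference_quotient(abs, 0, 1e-6)
left_slope = difference_quotient(abs, 0, -1e-6)
print(right_slope, left_slope)  # 1.0 -1.0
```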

What is a Derivative?

In calculus, the derivative of a function represents the rate at which the function is changing at any given point. Geometrically, it corresponds to the slope of the tangent line to the graph of the function at that point. The derivative provides important information about the behavior of a function, including where it is increasing or decreasing, where local extrema can occur, and (through the second derivative) its concavity.

Formal Definition

The derivative of a function f(x) with respect to x is denoted by f'(x) (or dy/dx when y = f(x)) and is defined as the limit of the average rate of change of the function over the interval from x to x + h as h approaches zero:

f'(x) = lim (h -> 0) [f(x + h) - f(x)] / h

Alternatively, if y = f(x), then the derivative dy/dx is given by:

dy/dx = lim (h -> 0) [f(x + h) - f(x)] / h

Example

Consider the function f(x) = x^2. We will find its derivative with respect to x.

Finding the Derivative

Using the definition of the derivative:

f'(x) = lim (h -> 0) [f(x + h) - f(x)] / h

= lim (h -> 0) [(x + h)^2 - x^2] / h

= lim (h -> 0) [x^2 + 2xh + h^2 - x^2] / h

= lim (h -> 0) [2xh + h^2] / h

= lim (h -> 0) [2x + h]

= 2x

So, the derivative of f(x) = x^2 with respect to x is f'(x) = 2x.

In this example, the derivative function f'(x) = 2x gives the slope of the tangent line to the graph of f(x) = x^2 at any point x.
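
The result f'(x) = 2x can be compared against a numerical approximation. The sketch below uses a central difference, which is a common variant of the limit definition rather than the one-sided formula shown above; the helper name `numerical_derivative` and the step size are assumptions.

```python
def numerical_derivative(f, x, h=1e-5):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def f(x):
    return x ** 2

# Compare the approximation with the exact derivative 2x at a few points.
for x in (-3, 0, 0.5, 2):
    approx = numerical_derivative(f, x)
    print(f"x={x}: approx={approx:.6f}, exact={2 * x}")
```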

Standard Derivative Formulas

Here are some standard derivative formulas for common functions:

Power Rule:

If f(x) = x^n, where n is a constant, then
f'(x) = nx^(n-1)

Exponential Rule:

If f(x) = e^x, then
f'(x) = e^x

Logarithmic Rule:

If f(x) = log_b(x), where the base b satisfies b > 0 and b ≠ 1, then for x > 0
f'(x) = 1 / (x * ln(b)), where ln denotes the natural logarithm

Sine Rule:

If f(x) = sin(x), then
f'(x) = cos(x)

Cosine Rule:

If f(x) = cos(x), then
f'(x) = -sin(x)

Tangent Rule:

If f(x) = tan(x), then
f'(x) = sec^2(x)

These are some of the basic derivative formulas used in calculus. They allow us to find the derivatives of various functions efficiently.
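
Each formula above can be sanity-checked numerically. The following sketch compares a central-difference approximation with the closed-form derivative at one sample point; the point x = 0.7, the exponent 3, the base 10, and the helper `numerical_derivative` are all arbitrary choices made for illustration.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7  # arbitrary sample point inside every function's domain

# Each entry pairs a function with the value its formula predicts at x.
checks = {
    "power, x^3": (lambda t: t ** 3, 3 * x ** 2),
    "exponential, e^x": (math.exp, math.exp(x)),
    "logarithm, log_10(x)": (math.log10, 1 / (x * math.log(10))),
    "sine": (math.sin, math.cos(x)),
    "cosine": (math.cos, -math.sin(x)),
    "tangent": (math.tan, 1 / math.cos(x) ** 2),  # sec^2(x) = 1 / cos^2(x)
}

errors = {
    name: abs(numerical_derivative(f, x) - expected)
    for name, (f, expected) in checks.items()
}

for name, err in errors.items():
    print(f"{name}: absolute error {err:.2e}")
```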

Rules for Finding Derivatives

In calculus, there are several rules that allow us to find the derivatives of more complex functions by combining the derivatives of simpler functions. Here are some of the most commonly used rules:

Constant Rule:

If f(x) = c, where c is a constant, then
f'(x) = 0

Sum Rule:

If f(x) = g(x) + h(x), then
f'(x) = g'(x) + h'(x)

Difference Rule:

If f(x) = g(x) - h(x), then
f'(x) = g'(x) - h'(x)

Product Rule:

If f(x) = g(x) * h(x), then
f'(x) = g'(x) * h(x) + g(x) * h'(x)

Quotient Rule:

If f(x) = g(x) / h(x), with h(x) ≠ 0, then
f'(x) = (g'(x) * h(x) - g(x) * h'(x)) / [h(x)]^2

Chain Rule:

If f(x) = g(h(x)), then
f'(x) = g'(h(x)) * h'(x)

These rules provide a systematic way to find the derivatives of various functions by breaking them down into simpler components and applying basic differentiation techniques.
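
As a rough check of the product, quotient, and chain rules, the sketch below picks g(x) = x^2 and h(x) = sin(x) and compares each rule's prediction with a numerical derivative. The choice of functions, the sample point x = 1.3, and all helper names are assumptions made for this example.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def g(t):
    return t ** 2

def dg(t):
    return 2 * t  # power rule

def h(t):
    return math.sin(t)

def dh(t):
    return math.cos(t)  # sine rule

x = 1.3  # sample point where h(x) != 0, so the quotient is defined

# Predictions from the rules.
product_rule = dg(x) * h(x) + g(x) * dh(x)
quotient_rule = (dg(x) * h(x) - g(x) * dh(x)) / h(x) ** 2
chain_rule = dg(h(x)) * dh(x)  # derivative of g(h(x)) = sin(x)^2

# Numerical derivatives of the combined functions.
product_num = numerical_derivative(lambda t: g(t) * h(t), x)
quotient_num = numerical_derivative(lambda t: g(t) / h(t), x)
chain_num = numerical_derivative(lambda t: g(h(t)), x)

print(f"product:  {product_rule:.6f} vs {product_num:.6f}")
print(f"quotient: {quotient_rule:.6f} vs {quotient_num:.6f}")
print(f"chain:    {chain_rule:.6f} vs {chain_num:.6f}")
```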

Higher Order Derivatives

In calculus, higher order derivatives refer to the derivatives of derivatives. For example, the second derivative of a function is the derivative of its first derivative, and so on.

Notation:

  • The first derivative of a function f(x) is denoted as f'(x) or dy/dx.
  • The second derivative of f(x) is denoted as f''(x) or d^2y/dx^2.
  • Higher order derivatives can be denoted as f^(n)(x) or d^n y/dx^n, where n is the order of the derivative.

Example:

Let's consider a function f(x) = x^3. We'll find its first, second, and third derivatives.

First Derivative:

f(x) = x^3

f'(x) = 3x^2

Second Derivative:

f'(x) = 3x^2

f''(x) = 6x

Third Derivative:

f''(x) = 6x

f'''(x) = 6

In this example, each differentiation lowers the degree of the polynomial by one: x^3 becomes 3x^2, then 6x, then the constant 6 (and every derivative after that is 0). The first derivative measures the rate of change of the function, the second derivative measures the rate of change of the rate of change, and so on. The second derivative, for instance, describes the concavity of the graph.
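
For polynomials, repeated differentiation is easy to sketch in code by applying the power rule to each term. The coefficient-list representation (lowest degree first) and the function name `differentiate` are assumptions made for this illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...],
    meaning a0 + a1*x + a2*x^2 + ..., using the power rule term by term."""
    return [k * a for k, a in enumerate(coeffs)][1:]

f = [0, 0, 0, 1]        # x^3
f1 = differentiate(f)   # [0, 0, 3]  ->  3x^2
f2 = differentiate(f1)  # [0, 6]     ->  6x
f3 = differentiate(f2)  # [6]        ->  6
f4 = differentiate(f3)  # []         ->  0 (no terms left)

print(f1, f2, f3, f4)
```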

Maxima and Minima

In calculus, maxima and minima are the points where a function takes its highest and lowest values, respectively. A local extremum is the highest or lowest point within a small neighborhood around it; it need not be the highest or lowest point of the whole graph.

Definition:

  • Maxima: A point x = c is a local maximum of a function f(x) if there exists an open interval (a, b) containing c such that f(x) ≤ f(c) for all x in (a, b). In other words, the function attains its highest value at x = c within a small interval around c.

  • Minima: A point x = c is a local minimum of a function f(x) if there exists an open interval (a, b) containing c such that f(x) ≥ f(c) for all x in (a, b). In other words, the function attains its lowest value at x = c within a small interval around c.

Example:

Consider the function f(x) = x^2 - 4x + 3. We'll find its critical points and determine whether they correspond to maxima or minima.

Critical Points:

To find the critical points, we set the derivative equal to zero and solve for x (the derivative of a polynomial is defined everywhere, so there are no other critical points):

f(x) = x^2 - 4x + 3

f'(x) = 2x - 4

Set f'(x) = 0:

2x - 4 = 0
2x = 4
x = 2

So, x = 2 is a critical point of the function.

Second Derivative Test:

To determine whether the critical point corresponds to a maximum or minimum, we use the second derivative test. If the second derivative is positive at the critical point, it's a local minimum. If it's negative, it's a local maximum. If it's zero, the test is inconclusive.

f''(x) = 2

At x = 2:
f''(2) = 2 > 0

Since the second derivative is positive at x = 2, the function f(x) = x^2 - 4x + 3 has a local minimum at this point, with value f(2) = 4 - 8 + 3 = -1.
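
The same procedure can be sketched for any quadratic f(x) = ax^2 + bx + c with a ≠ 0, where f'(x) = 2ax + b and f''(x) = 2a. The function name `classify_quadratic` and its return format are hypothetical choices for this illustration.

```python
def classify_quadratic(a, b, c):
    """Find and classify the critical point of f(x) = a*x^2 + b*x + c (a != 0)."""
    x_crit = -b / (2 * a)   # solves f'(x) = 2*a*x + b = 0
    second = 2 * a          # f''(x) is constant for a quadratic
    kind = "local minimum" if second > 0 else "local maximum"
    value = a * x_crit ** 2 + b * x_crit + c
    return x_crit, value, kind

# The example from the text: f(x) = x^2 - 4x + 3.
print(classify_quadratic(1, -4, 3))  # (2.0, -1.0, 'local minimum')
```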

What is a Multivariable Function?

A multivariable function is a function of two or more input variables. It associates each element of its domain, a set of ordered pairs (or tuples) of real numbers, with exactly one output value. The set of output values the function actually attains is called the range. Multivariable functions describe mathematical relationships where one quantity depends on two or more other quantities.

In mathematical notation, a multivariable function is typically written as f(x, y, z, ...), where f denotes the function, and x, y, z, etc., are elements from the domain. The expression f(x, y, z, ...) represents the value of the function at the elements x, y, z, etc.

Example

Consider the function:

f(x, y) = x^2 + y^2

This function takes two inputs, x and y, and returns the sum of their squares as the output.

Let's evaluate this function for a few values of x and y:

- If x = 1 and y = 1, then f(1, 1) = 1^2 + 1^2 = 1 + 1 = 2
- If x = 2 and y = 3, then f(2, 3) = 2^2 + 3^2 = 4 + 9 = 13
- If x = 0 and y = -2, then f(0, -2) = 0^2 + (-2)^2 = 0 + 4 = 4

In this example, the domain of the function is all pairs of real numbers (x, y), and the range is all non-negative real numbers (since a sum of squares of real numbers is never negative, and every value r ≥ 0 is attained, for example at (√r, 0)).

Visual Representation

Multivariable functions can be visualized using graphs in higher dimensions. For example, the function f(x, y) = x^2 + y^2 can be visualized as a surface in three-dimensional space, where the height of the surface at each point (x, y) corresponds to the value of the function at that point.
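
A small Python sketch can reproduce the evaluations above and give a rough feel for the surface by printing its heights on a coarse grid. The grid points and variable names are arbitrary choices for illustration.

```python
def f(x, y):
    """f(x, y) = x^2 + y^2 from the example above."""
    return x ** 2 + y ** 2

# The same points evaluated in the text.
points = [(1, 1), (2, 3), (0, -2)]
values = [f(x, y) for x, y in points]
print(values)  # [2, 13, 4]

# Heights of the surface on a 3x3 grid: lowest at the origin, rising outward.
grid = [[f(x, y) for x in (-1, 0, 1)] for y in (1, 0, -1)]
for row in grid:
    print(row)
```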

Derivatives of Multivariable Functions

In multivariable calculus, the concept of derivatives extends to functions of more than one variable. The derivatives of multivariable functions measure how the function changes as each of its input variables changes. These derivatives are essential for understanding the behavior and properties of multivariable functions.

Partial Derivatives

The partial derivative of a multivariable function with respect to one of its variables measures the rate at which the function changes as that variable changes, while keeping the other variables constant.

Notation

For a function f(x, y), the partial derivative of f with respect to x is denoted by:

∂f/∂x

And the partial derivative of f with respect to y is denoted by:

∂f/∂y

Example

Consider the function:

f(x, y) = x^2 + y^3

To find the partial derivatives, we differentiate f with respect to each variable separately.

Partial Derivative with respect to x

∂f/∂x = ∂/∂x (x^2 + y^3)
       = ∂/∂x (x^2) + ∂/∂x (y^3)
       = 2x + 0
       = 2x

Partial Derivative with respect to y

∂f/∂y = ∂/∂y (x^2 + y^3)
       = ∂/∂y (x^2) + ∂/∂y (y^3)
       = 0 + 3y^2
       = 3y^2

So, the partial derivatives of f(x, y) are:

∂f/∂x = 2x
∂f/∂y = 3y^2
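
These partial derivatives can be checked numerically by nudging one variable at a time while holding the other fixed. The helpers `partial_x` and `partial_y`, the central-difference formula, and the sample point (1.5, 2.0) are assumptions made for this sketch.

```python
def f(x, y):
    return x ** 2 + y ** 3

def partial_x(f, x, y, h=1e-6):
    """Approximate ∂f/∂x: vary x only, keep y constant."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Approximate ∂f/∂y: vary y only, keep x constant."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x0, y0 = 1.5, 2.0
print(partial_x(f, x0, y0), "vs exact", 2 * x0)       # exact value is 3.0
print(partial_y(f, x0, y0), "vs exact", 3 * y0 ** 2)  # exact value is 12.0
```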

Gradient

The gradient of a multivariable function is a vector that consists of all its partial derivatives. It points in the direction of the steepest ascent of the function.

Notation

For a function f(x, y), the gradient is denoted by ∇f and is given by:

∇f = (∂f/∂x, ∂f/∂y)

Example

Using the previous function f(x, y) = x^2 + y^3, the gradient is:

∇f = (2x, 3y^2)
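
To illustrate the steepest-ascent property, the sketch below takes a small step along the gradient and a small step against it, then compares the function values. The starting point (1, 1) and the step size 0.01 are arbitrary, and `grad_f` simply encodes the gradient derived above.

```python
def f(x, y):
    return x ** 2 + y ** 3

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^3, i.e. (2x, 3y^2)."""
    return (2 * x, 3 * y ** 2)

x0, y0 = 1, 1
gx, gy = grad_f(x0, y0)
print((gx, gy))  # (2, 3)

step = 0.01
uphill = f(x0 + step * gx, y0 + step * gy)    # move along the gradient
downhill = f(x0 - step * gx, y0 - step * gy)  # move against the gradient

# The function increases along the gradient and decreases against it.
print(downhill < f(x0, y0) < uphill)  # True
```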

Gradient Descent

Gradient descent is an optimization algorithm that minimizes a function by iteratively stepping in the direction of steepest descent, which is the negative of the gradient. It is widely used in machine learning and deep learning to minimize the cost function and improve model performance.

The Concept

In the context of a multivariable function, the gradient descent algorithm seeks to find the minimum value of the function by following the direction of the negative gradient. The gradient at any point indicates the direction of the steepest ascent, so moving in the opposite direction (the negative gradient) leads to the steepest descent.

Basic Steps

  1. Initialize: Start with an initial guess for the function's variables.
  2. Compute Gradient: Calculate the gradient of the function at the current point.
  3. Update Variables: Adjust the variables in the direction of the negative gradient.
  4. Iterate: Repeat steps 2 and 3 until convergence (i.e., when the changes in the function value become negligible).

Mathematical Formulation

For a function f(x, y), the update rule for gradient descent can be written as:

x_new = x_old - α * (∂f/∂x)
y_new = y_old - α * (∂f/∂y)

Here, α (alpha) is the learning rate, a positive scalar that determines the step size, and the partial derivatives are evaluated at the current point (x_old, y_old).

Example

Consider the function:

f(x, y) = x^2 + y^2

This function has a global minimum at (x, y) = (0, 0). We will use gradient descent to find this minimum.

Step-by-Step Process

  1. Initialize: Start with an initial guess, say (x_0, y_0) = (3, 4).
  2. Compute Gradient: Calculate the partial derivatives of f(x, y).

∂f/∂x = 2x
∂f/∂y = 2y

At (3, 4):

∂f/∂x = 2 * 3 = 6
∂f/∂y = 2 * 4 = 8

  3. Update Variables: Choose a learning rate, say α = 0.1.

x_new = 3 - 0.1 * 6 = 3 - 0.6 = 2.4
y_new = 4 - 0.1 * 8 = 4 - 0.8 = 3.2

  4. Iterate: Repeat the process with the new values (2.4, 3.2).

Calculate the gradient at (2.4, 3.2):

∂f/∂x = 2 * 2.4 = 4.8
∂f/∂y = 2 * 3.2 = 6.4

Update the variables:

x_new = 2.4 - 0.1 * 4.8 = 2.4 - 0.48 = 1.92
y_new = 3.2 - 0.1 * 6.4 = 3.2 - 0.64 = 2.56

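Repeat these steps until the values of x and y converge to (0, 0). With α = 0.1, each update multiplies both coordinates by 1 - 2α = 0.8, so the iterates shrink steadily toward the minimum.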
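
The whole procedure can be sketched as a short loop. This is a minimal illustration for the example f(x, y) = x^2 + y^2, not a production optimizer: the function name `gradient_descent`, the fixed number of steps (used instead of a convergence test), and the `history` list are assumptions made here.

```python
def grad(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return 2 * x, 2 * y

def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Run a fixed number of gradient descent updates and record every point."""
    x, y = start
    history = [(x, y)]
    for _ in range(steps):
        gx, gy = grad(x, y)            # compute the gradient at the current point
        x = x - learning_rate * gx     # step against the gradient
        y = y - learning_rate * gy
        history.append((x, y))
    return history

history = gradient_descent(grad, start=(3, 4))

# The first two updates match the hand calculation above.
for x, y in history[1:3]:
    print(f"({x:.2f}, {y:.2f})")  # (2.40, 3.20) then (1.92, 2.56)

# After 100 steps the point is very close to the minimum at (0, 0).
print(history[-1])
```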

Convergence

The convergence of gradient descent depends on the choice of the learning rate:

  • Too large: The algorithm may overshoot the minimum and fail to converge.
  • Too small: The algorithm may converge very slowly.
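
The effect of the learning rate can be seen on the same example f(x, y) = x^2 + y^2. The sketch below runs 50 updates from (3, 4) with three learning rates and reports the final distance from the minimum; the specific rates, the step count, and the helper name `run` are illustrative assumptions.

```python
def run(learning_rate, steps=50, start=(3.0, 4.0)):
    """Return the distance from (0, 0) after gradient descent on x^2 + y^2."""
    x, y = start
    for _ in range(steps):
        # The gradient is (2x, 2y), so each update scales x and y by (1 - 2 * learning_rate).
        x, y = x - learning_rate * 2 * x, y - learning_rate * 2 * y
    return (x ** 2 + y ** 2) ** 0.5

for rate in (0.01, 0.1, 1.1):
    print(f"learning rate {rate}: distance from minimum = {run(rate):.6g}")
```

With the small rate the point has barely moved from its starting distance of 5, with the moderate rate it is very close to the minimum, and with the large rate every step overshoots further and the distance grows.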