Hyperbolic Tangent Activation Function

Easy:

Imagine you’re playing a video game where you fight monsters, and you need to decide how strong each attack is based on how close you are to the monster. A blunt activation function is like a bulldozer: it takes any number, no matter how large or small, and flattens it into just two choices, a really strong (positive) attack or a really weak (negative) one.

The Hyperbolic Tangent function, also called Tanh for short, is like a more nuanced fighter. Here’s how it works:

  • Getting closer, stronger attack! The closer you get to the monster (positive numbers), the stronger your attack gets, but it doesn’t go infinitely strong. It reaches a maximum strength, like your strongest move in the game.

  • Getting farther, weaker attack! The farther you get from the monster (negative numbers), the weaker your attack gets, but it doesn’t go infinitely weak. It bottoms out at a minimum strength, like a little tap.

  • Right next to the monster? Balanced attack! If you’re right next to the monster (zero), your attack is medium strength, like a basic punch.

Tanh helps the computer program understand the relationship between the distance (numbers) and the attack strength. It doesn’t just turn things on or off completely, but creates a smooth scale based on how close you are. This is useful in many situations where things aren’t just strong or weak, but have different levels in between.
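To make the analogy concrete, here is a minimal sketch (the distance numbers are made up for illustration) of how Python’s built-in math.tanh turns any input into a smooth score between -1 and 1:

```python
import math

# Illustrative distances: negative = far from the monster, positive = close.
# These numbers are made up for this sketch.
distances = [-10, -2, -0.5, 0, 0.5, 2, 10]

for d in distances:
    strength = math.tanh(d)  # smoothly squashed into (-1, 1)
    print(f"distance {d:>5}: attack strength {strength:+.4f}")
```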

Moderate:

Imagine you’re playing with a squishy toy snake. When you first pick it up, it’s straight and maybe a bit stiff. But as you squeeze it gently, it starts to bend and curve, becoming more flexible and able to twist and turn in interesting ways. Then, if you really squeeze it tight, it becomes almost flat, lying straight along the ground.

Now, let’s think about a special kind of math function that’s a lot like how your snake behaves. This function is called the “Hyperbolic Tangent,” often shortened to “tanh.” It’s used in deep learning, which is a way for computers to learn new things from lots of data.

How Tanh Works

The tanh function takes a number as input and gives you a squishy range of outputs between -1 and 1. Here’s how it works:

  1. When the Input Is Near Zero: If you give it a number close to zero, the output stays close to zero, and if you input exactly 0, the output is exactly 0. This is like when your snake is relaxed and not being squeezed too hard.

  2. As the Input Gets Bigger: As you keep increasing the input, the output starts to get bigger, but it stays between -1 and 1. So, if you input a big positive number, the output gets closer to 1, and if you input a big negative number, the output gets closer to -1. This is like squeezing the snake harder and harder; it bends more and more until it’s almost flat.

  3. Symmetry and Squishing: No matter if the input is positive or negative, the tanh function always gives you an output that’s the same distance from 0 but on the opposite side (tanh(-x) = -tanh(x)). And it always keeps everything squished between -1 and 1, just like how your snake can only bend so far before it’s flat. The short sketch after this list shows all three behaviors.
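Here is a minimal sketch of those three behaviors, using math.tanh from Python’s standard library (the sample inputs are arbitrary):

```python
import math

# 1. Near zero, tanh(x) is close to x (and tanh(0) == 0).
print(math.tanh(0.0))    # 0.0
print(math.tanh(0.1))    # ~0.0997

# 2. Large inputs saturate toward +1 or -1.
print(math.tanh(5.0))    # ~0.9999
print(math.tanh(-5.0))   # ~-0.9999

# 3. Odd symmetry: tanh(-x) == -tanh(x).
print(math.tanh(2.0), math.tanh(-2.0))  # ~0.9640, ~-0.9640
```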

Why Use Tanh in Deep Learning?

In deep learning, activation functions like tanh are used to help computers learn from data. They determine whether a neuron (a part of the computer’s brain) should be active or not based on its input. By squishing the input into a nice, manageable range (-1 to 1), tanh helps the computer work with numbers that are easier to handle and interpret.

It’s especially good at keeping the outputs balanced around 0: because tanh is zero-centered, the layers that come after it receive inputs with a mean near zero, which tends to make gradient-based training converge faster. For example, if the network is learning to tell happy faces from sad faces, a neutral expression can map to an output near 0, with happiness pushed toward +1 and sadness toward -1.
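As a quick illustration of that zero-centering, the sketch below compares mean tanh and sigmoid outputs on zero-mean random inputs (the hand-rolled sigmoid and the sample size are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=10_000)  # zero-mean inputs

tanh_out = np.tanh(x)
sigmoid_out = 1.0 / (1.0 + np.exp(-x))

# tanh outputs stay centered near 0; sigmoid outputs sit near 0.5.
print(f"mean of tanh outputs:    {tanh_out.mean():+.4f}")
print(f"mean of sigmoid outputs: {sigmoid_out.mean():+.4f}")
```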

Conclusion

So, the hyperbolic tangent activation function is like a squishy snake in the world of deep learning. It takes inputs and squishes them into a useful range, helping computers learn and make sense of the data they’re working with. Just like how your snake can show you a lot about flexibility and balance by bending and twisting, tanh shows a network how to respond smoothly to inputs of every size.

Hard:

Now let’s drop the analogies and look at what tanh actually is. The hyperbolic tangent is defined in terms of exponentials:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)) = (e^(2x) - 1) / (e^(2x) + 1)

How it Works

  1. Range and Shape: tanh is a smooth, strictly increasing, S-shaped curve that maps every real number into the open interval (-1, 1). Because it is monotonic, it preserves the ordering of its inputs.

  2. Zero-Centered and Odd: tanh(0) = 0 and tanh(-x) = -tanh(x), so the outputs are symmetric around zero. This is the key difference from the sigmoid, whose outputs live in (0, 1).

  3. Relation to Sigmoid: tanh is a rescaled sigmoid: tanh(x) = 2·σ(2x) - 1, where σ(z) = 1 / (1 + e^(-z)). The two share the same S-shape; tanh is just stretched and shifted to cover (-1, 1).

  4. Derivative: d/dx tanh(x) = 1 - tanh²(x). The gradient peaks at 1 when x = 0, four times the sigmoid’s maximum gradient of 0.25, and decays toward 0 as |x| grows.

  5. Saturation: For |x| greater than about 3, tanh(x) is already within roughly 0.005 of ±1, so the gradient there is nearly zero. Neurons pushed into this saturated region learn very slowly.

Because it squashes pre-activations into (-1, 1) while keeping them centered at zero, tanh lets deep learning models pass well-scaled signals from layer to layer. This is why it was a standard choice in early feed-forward networks and is still found inside recurrent architectures such as LSTMs.
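The derivative identity in point 4 is easy to sanity-check numerically; in this sketch, numerical_derivative is a hypothetical helper using a central finite difference:

```python
import math

def numerical_derivative(f, x, h=1e-6):
    # Central finite difference: (f(x+h) - f(x-h)) / (2h)
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [-2.0, 0.0, 0.5, 3.0]:
    analytic = 1 - math.tanh(x) ** 2        # 1 - tanh^2(x)
    numeric = numerical_derivative(math.tanh, x)
    print(f"x={x:+.1f}: analytic={analytic:.6f}, numeric={numeric:.6f}")
```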

Key Points

  • Tanh is similar to Sigmoid: Both functions have the same S-shape, but tanh produces outputs in the range -1 to 1, while sigmoid produces outputs in the range 0 to 1.

  • Tanh is used in deep learning: It is commonly used in neural networks, especially in recurrent neural networks (RNNs) and long short-term memory (LSTM) networks.

  • Tanh mitigates vanishing gradients: its maximum gradient is 1, versus 0.25 for sigmoid, so gradients shrink more slowly as they flow back through stacked layers. It does not eliminate the problem, though, because tanh still saturates for large |x|.

  • Tanh is used for symmetric outputs: It produces symmetric outputs around zero, which can lead to faster convergence during training.
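The first key point can be checked directly. Below is a minimal sketch of the identity tanh(x) = 2·sigmoid(2x) - 1, with a hand-rolled sigmoid helper (not from any library):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for x in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    lhs = math.tanh(x)
    rhs = 2 * sigmoid(2 * x) - 1
    print(f"x={x:+.1f}: tanh={lhs:+.6f}, 2*sigmoid(2x)-1={rhs:+.6f}")
```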

Implementation

The tanh function can be implemented in Python using the math library:

```python
import math

def tanh(x):
    # tanh(x) = (e^x - e^-x) / (e^x + e^-x)
    t = (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))
    return t
```
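One caveat with this direct formula: math.exp overflows once its argument passes roughly 709, so tanh(1000) would raise OverflowError even though the answer is clearly 1. Here is a sketch of a numerically safer variant (tanh_stable is a name introduced just for this illustration) that only ever exponentiates a non-positive number; in practice, the standard library’s math.tanh already handles this:

```python
import math

def tanh_stable(x):
    # Avoid overflow by only exponentiating a non-positive quantity.
    if x >= 0:
        z = math.exp(-2.0 * x)   # z is in (0, 1], never overflows
        return (1.0 - z) / (1.0 + z)
    else:
        z = math.exp(2.0 * x)
        return (z - 1.0) / (z + 1.0)

print(tanh_stable(1000.0))  # 1.0 (the naive formula would overflow here)
print(math.tanh(1000.0))    # the standard library handles this too
```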

Example

Here is an example of applying the tanh function to a batch of input values, the way a layer in a neural network would:

```python
import numpy as np

# Define the input data
inputs = np.array([-4, -3, -2, 0, 2, 3, 4])

# Calculate the outputs using the tanh function defined above
outputs = [tanh(x) for x in inputs]

# Print the outputs
for i, output in enumerate(outputs):
    print(f"Input: {inputs[i]}, Output: {output}")
```

This example calculates the tanh of a range of input values and prints the results.
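Since the inputs are already a NumPy array, the same result can be computed in one vectorized call (a minor idiomatic alternative, not from the original code):

```python
import numpy as np

inputs = np.array([-4, -3, -2, 0, 2, 3, 4])
outputs = np.tanh(inputs)  # applies tanh element-wise
print(outputs)
```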

If you want you can support me: https://buymeacoffee.com/abhi83540

If you want such articles in your email inbox you can subscribe to my newsletter: https://abhishekkumarpandey.substack.com/

A few books on deep learning that I am reading: