Dendrograms in Plotly

Dendrograms are tree-like charts used to visualize hierarchical clustering. They help show how observations or features are grouped together and how similar or dissimilar those groups are. Plotly is useful here because it makes the dendrogram interactive, which helps when you want to inspect labels and cluster structure more closely.

This tutorial shows how to create a dendrogram in Plotly, customize the layout, change orientation, and build a more practical example with readable labels.

### Basic Dendrogram

Let's start with a simple dendrogram built from a small matrix of values.

```python
import numpy as np
import pandas as pd
import plotly.figure_factory as ff
import scipy  # required by create_dendrogram

np.random.seed(0)

labels = ["A", "B", "C", "D", "E", "F"]
data = pd.DataFrame(np.abs(np.random.randn(6, 4)), index=labels)

fig = ff.create_dendrogram(data.values, labels=labels)

fig.update_layout(
    title="Basic Dendrogram",
    width=700,
    height=450,
    showlegend=False,
)

fig.show()
```

- **`ff.create_dendrogram(...)`** runs hierarchical clustering on the rows of the input array (using SciPy) and returns a Plotly figure of the resulting tree.
- Branch heights represent distances between merged groups.
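
To see where those branch heights come from, the linkage can be computed directly with SciPy. This is a minimal sketch, not Plotly's own code. It assumes `create_dendrogram` is left at its defaults, which in current Plotly releases should be Euclidean distances from `pdist` and complete linkage; check the defaults in your installed version.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

np.random.seed(0)
labels = ["A", "B", "C", "D", "E", "F"]
values = np.abs(np.random.randn(6, 4))

# Condensed vector of pairwise Euclidean distances between the six rows.
distances = pdist(values)

# Each row of the linkage matrix records one merge:
# [cluster index 1, cluster index 2, merge distance, leaves in new cluster].
# "complete" is an assumption meant to mirror Plotly's default linkage.
merges = linkage(distances, method="complete")

for left, right, height, size in merges:
    print(f"merge {int(left)} + {int(right)} at height {height:.3f} ({int(size)} leaves)")

# The third column holds the heights at which branches join in the chart.
merge_heights = merges[:, 2]
```

Six observations always produce five merges, and the last merge joins all six leaves into a single cluster.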

### Right-Oriented Dendrogram

Changing the orientation can make long labels easier to read.

```python
import numpy as np
import pandas as pd
import plotly.figure_factory as ff
import scipy  # required by create_dendrogram

np.random.seed(1)

labels = ["Sample A", "Sample B", "Sample C", "Sample D", "Sample E", "Sample F"]
data = pd.DataFrame(np.abs(np.random.randn(6, 5)), index=labels)

fig = ff.create_dendrogram(data.values, labels=labels, orientation="right")

fig.update_layout(
    title="Right-Oriented Dendrogram",
    width=800,
    height=500,
    margin=dict(l=180, r=40, t=60, b=40),
    showlegend=False,
)

fig.show()
```

- **`orientation="right"`** places labels on the y-axis instead of the x-axis.
- Extra left margin helps prevent long labels from being clipped.
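
Rather than hard-coding the left margin, it can be estimated from the longest label. The helper below is a hypothetical sketch: `left_margin_for`, `px_per_char`, and `padding` are illustrative names and rough constants, not part of Plotly, and the right values depend on your font and font size.

```python
def left_margin_for(labels, px_per_char=8, padding=40):
    """Estimate a left margin (in pixels) wide enough for the longest label.

    px_per_char and padding are rough, assumed constants, not Plotly values.
    """
    longest = max((len(str(label)) for label in labels), default=0)
    return longest * px_per_char + padding


labels = ["Sample A", "Sample B", "Sample C", "Sample D", "Sample E", "Sample F"]
left_margin = left_margin_for(labels)
print(left_margin)  # 8 characters * 8 px + 40 px padding = 104
```

The result can be passed as `margin=dict(l=left_margin, r=40, t=60, b=40)` in `fig.update_layout(...)`. Plotly axes also have an `automargin` attribute (for example `fig.update_yaxes(automargin=True)`), which may make a manual estimate unnecessary.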

### Customizing Labels and Figure Layout

Layout tuning makes the chart easier to interpret when you have more observations.

```python
import numpy as np
import pandas as pd
import plotly.figure_factory as ff
import scipy  # required by create_dendrogram

np.random.seed(2)

labels = [f"Item {i}" for i in range(1, 9)]
data = pd.DataFrame(np.abs(np.random.randn(8, 4)), index=labels)

fig = ff.create_dendrogram(data.values, labels=labels, orientation="bottom")

fig.update_layout(
    title="Customized Dendrogram Layout",
    width=850,
    height=500,
    template="plotly_white",
    xaxis=dict(tickangle=-30),
    yaxis=dict(title="Distance"),
    showlegend=False,
)

fig.show()
```

- Rotating tick labels can improve readability when labels are crowded.
- Adding a y-axis title helps clarify that branch height reflects clustering distance.

### Using Color Thresholds and Interpreting Branches

Dendrogram interpretation depends on branch height. Groups that merge low in the chart are more similar to each other, while merges that happen high up indicate larger differences between groups.
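
A tiny worked example makes the height rule concrete. The sketch below uses SciPy directly on three points on a line; complete linkage is an assumption chosen to mirror what Plotly is expected to use by default.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Three observations on a line: two close together, one far away.
points = np.array([[0.0], [1.0], [5.0]])

# Complete linkage: the distance between two clusters is the largest
# distance between any pair of their members.
merges = linkage(points, method="complete")
heights = merges[:, 2]
print(heights)  # [1. 5.]
```

The points at 0 and 1 join at height 1. That pair then joins the point at 5 at height 5, the larger of the distances 5 and 4. The low merge marks the similar pair, and the high merge marks the outlier.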

```python
import numpy as np
import pandas as pd
import plotly.figure_factory as ff
import scipy  # required by create_dendrogram

np.random.seed(3)

labels = ["North", "South", "East", "West", "Central", "Remote"]
data = pd.DataFrame(np.abs(np.random.randn(6, 4)), index=labels)

fig = ff.create_dendrogram(
    data.values,
    labels=labels,
    color_threshold=1.5,
)

fig.update_layout(
    title="Dendrogram with Color Threshold",
    width=750,
    height=450,
    yaxis=dict(title="Cluster Distance"),
    showlegend=False,
)

fig.show()
```

- **`color_threshold=1.5`** sets the distance below which branches are colored by cluster; links that merge above the threshold share a single color.
- Lower merge heights suggest stronger similarity within those groups.
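
The colors only hint at group membership. To get explicit cluster assignments at the same cut, one option is SciPy's `fcluster`. This is a sketch under the assumption that the figure was built with Euclidean distance and complete linkage; the groups should generally match the colored branches, though merges that fall exactly on the threshold may be treated differently.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

np.random.seed(3)
labels = ["North", "South", "East", "West", "Central", "Remote"]
values = np.abs(np.random.randn(6, 4))

# Assumed to mirror the clustering behind the figure (complete linkage).
merges = linkage(values, method="complete")

# Cut the tree at the same distance used for color_threshold.
cluster_ids = fcluster(merges, t=1.5, criterion="distance")

# Collect the labels that fall into each flat cluster.
groups = {}
for label, cluster_id in zip(labels, cluster_ids):
    groups.setdefault(int(cluster_id), []).append(label)

for cluster_id, members in sorted(groups.items()):
    print(cluster_id, members)
```

Raising `t` merges more regions into fewer clusters; lowering it splits them apart.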

### Practical Example: Clustering Product Profiles

Here is a more practical example where each row represents a product and each column represents a measured feature. A dendrogram helps reveal which products behave similarly across those features.

```python
import numpy as np
import pandas as pd
import plotly.figure_factory as ff
import scipy  # required by create_dendrogram

np.random.seed(4)

products = [
    "Product A",
    "Product B",
    "Product C",
    "Product D",
    "Product E",
    "Product F",
    "Product G",
]

data = pd.DataFrame(
    np.abs(np.random.randn(7, 5)),
    index=products,
    columns=["Price", "Speed", "Quality", "Durability", "Support"],
)

fig = ff.create_dendrogram(data.values, labels=products, orientation="right")

fig.update_layout(
    title="Product Profile Clustering",
    width=900,
    height=520,
    template="plotly_white",
    margin=dict(l=220, r=40, t=60, b=40),
    xaxis=dict(title="Cluster Distance"),
    showlegend=False,
)

fig.show()
```

- Products that merge at a smaller distance (closer to the labels) tend to have more similar feature profiles.
- This kind of chart is useful for segmentation, comparison, and exploratory analysis.

### Conclusion

Plotly dendrograms are a practical way to explore hierarchical clustering in an interactive format. By adjusting orientation, labels, and layout, you can make the cluster structure much easier to inspect and explain.