TL;DR: `transpose()` makes tensors non-contiguous (logical ≠ physical order). `view()` only works with contiguous tensors → RuntimeError. Solution: use `reshape()` instead of `view()`!
The Problem
Imagine you want to represent a small gradient image of 8 pixels in PyTorch as a tensor and classify it with a neural network. The easiest way is to use a 2x4 tensor, where each value represents the color intensity.
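In code, that might look like this (the values 1 to 8 are the ones used throughout this post):

```python
import torch

# A small "gradient image": 2 rows x 4 columns of color intensities
image = torch.tensor([[1, 2, 3, 4],
                      [5, 6, 7, 8]])

print(image.shape)  # torch.Size([2, 4])
```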
Now you want to rotate the image by 90° before it is classified:
```python
image_rotated = image.transpose(0, 1)
```
Finally, the image must be flattened into a single row of 8 values, because the neural network can only process inputs of that shape:
```python
image_flat = image_rotated.view(1, 8)
```
But "Oh no!", now there's an error message:
```
RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces).
```
Why does this error occur? And how can we fix it?
In this post you'll learn:
- How PyTorch stores tensors in memory
- Why `transpose()` makes a tensor "non-contiguous"
- What `contiguous()` does and when you need it
- Two solution approaches
Fundamentals: How Tensors Are Stored in Memory
Tensors can represent multidimensional arrays - think of the image above, videos, or other multidimensional data. Computer memory, however, only knows one-dimensional arrays, in which all data is stored one element after another.
To store these multidimensional tensors, PyTorch uses a simple trick: it stores all the data sequentially in one long chain and keeps a small piece of metadata, the strides, that describes how this chain should be interpreted when the tensor is used.
Example:
The image from the beginning of the post is stored in memory as follows:
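You can inspect both the flat memory order and the strides yourself; a quick sketch with the 2x4 tensor from above:

```python
# For a freshly created (contiguous) tensor, the logical reading order
# equals the physical order in memory:
print(image.flatten())  # tensor([1, 2, 3, 4, 5, 6, 7, 8])

# The strides describe how this flat data is interpreted as a 2x4 matrix:
print(image.stride())   # (4, 1)
```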
The stride (4, 1) means the following: the first value (4) is how many elements you have to jump forward in memory to reach the next row of the matrix; the second value (1) is how many elements you have to jump to reach the next column.
For example, to find the element at position [1, 2] (indexing starts at 0, so: second row, third column, which holds the 7), compute the memory offset from the strides: 1 · 4 + 2 · 1 = 6, and index 6 of the flat memory array [1, 2, 3, 4, 5, 6, 7, 8] contains exactly the 7.
Why `transpose()` Makes the Tensor "Non-contiguous"
Think back to the example from the beginning: when we rotated the image with the `transpose()` method, one might assume that the complete data array in memory is restructured, as if we had deleted the old tensor and created a new one, stored in memory as [1, 5, 2, 6, 3, 7, 4, 8].
However, such a complete restructuring of the array would make every `transpose()` call unnecessarily expensive. Instead, PyTorch keeps the original memory layout of the tensor and only adjusts the strides.
In our example, `transpose()` rewrites them to (1, 4), which means PyTorch can interpret the same data as a 4x2 matrix when reading.
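We can verify this directly (assuming the `image` tensor from above):

```python
image_rotated = image.transpose(0, 1)

print(image_rotated.shape)            # torch.Size([4, 2])
print(image_rotated.stride())         # (1, 4)
print(image_rotated.is_contiguous())  # False
```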
For example, to find the element at position [2, 1] (third row, second column = the 7), the new strides give the offset 2 · 1 + 1 · 4 = 6: the same memory cell as before, just reached via a different interpretation.
But as elegant as this stride manipulation is, it comes with a problem: the logical arrangement of the tensor no longer matches the physical arrangement in memory.
What Does "Contiguous" Mean?
A tensor is contiguous when the order in which we logically read its elements exactly matches the order in which they are stored in memory.
The key point: after `transpose()`, the tensor still points to the same data in memory, but because of the changed strides, we read it in a different order.
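A small check that both tensors really do share the same memory, and that only the reading order differs:

```python
# Both tensors point to the same underlying memory block ...
print(image.data_ptr() == image_rotated.data_ptr())  # True

# ... but reading them logically yields different orders
print(image.flatten())          # tensor([1, 2, 3, 4, 5, 6, 7, 8])
print(image_rotated.flatten())  # tensor([1, 5, 2, 6, 3, 7, 4, 8])
```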
The Problem with view()
If we call `.view()` on a non-contiguous tensor, as in the example at the very beginning, we get exactly this RuntimeError. The `view()` method can only handle contiguous tensors because it - like `transpose()` - only changes the strides, not the actual data:
```python
image_rotated = image.transpose(0, 1)

# What view(1, 8) would expect:
# one row with the elements in logical order
# → [1, 5, 2, 6, 3, 7, 4, 8]

# What actually lies in memory:
# → [1, 2, 3, 4, 5, 6, 7, 8]
```
The Solution: reshape()
`reshape()` is an intelligent wrapper that automatically checks whether the tensor is contiguous (see the sketch after this list):
- Contiguous? → Behaves like `view()`, no copy needed (very fast!)
- Non-contiguous? → Automatically calls `contiguous()` first and then creates the new shape
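A minimal sketch of both cases:

```python
import torch

# Case 1 - contiguous: reshape() behaves like view(), no copy is made
a = torch.arange(8).reshape(2, 4)
b = a.reshape(1, 8)
print(a.data_ptr() == b.data_ptr())  # True - same memory

# Case 2 - non-contiguous: reshape() copies the data under the hood
c = a.transpose(0, 1).reshape(1, 8)
print(a.data_ptr() == c.data_ptr())  # False - new memory
```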
Recommendation for production code: Use reshape() instead of view(), as it's more robust and in most cases just as performant. The difference is only measurable with very large tensors and frequent reshaping operations.
The Most Important Things in 30 Seconds
Key Takeaways:
- Tensors are always stored as 1D arrays in memory; strides define the interpretation
- `transpose()` only changes the strides (fast!), but makes the tensor non-contiguous
- Contiguous: logical reading order = physical memory order
- `view()` only works with contiguous tensors → RuntimeError on non-contiguous ones
- Solution 1: `.contiguous().view()` - copies the data, then reshapes
- Solution 2: `.reshape()` - checks contiguity automatically and copies only when needed (recommended)
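Both solutions in code, applied to the rotated image from the beginning:

```python
# Solution 1: make the data contiguous first (copies), then view
image_flat = image_rotated.contiguous().view(1, 8)

# Solution 2: let reshape() decide automatically (recommended)
image_flat = image_rotated.reshape(1, 8)

print(image_flat)  # tensor([[1, 5, 2, 6, 3, 7, 4, 8]])
```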