Creating A Universal Image Crop/Scale Animation for the Web

Taking a look at implementing complex animations, and what I learned along the way.

Background

AMP has an image lightbox component that allows enlarging <img>s within a fullscreen modal. Previously, the lightbox opening/closing animation in AMP could only animate correctly when the <img> had the same aspect ratio in both the page and the lightbox. Additionally, it did not work with values of object-fit other than fill. The object-fit CSS property determines how an image renders within an <img> element. For <img>s where we could not animate, we did a simple fade-in instead.

I wanted to support animating the <img> into lightbox mode, even when the aspect ratios did not match or different values for object-position were used. This would allow a cropped version to appear in the page, with the full version in a lightbox. While implementing this as part of my work on AMP, I made the code a standalone library so developers not using AMP could benefit as well. Below is a video of what I was aiming for.

A video showing a lightbox animation, with a square cropped image going to an uncropped image. A live demo of the animation is available.

Note that we are animating the following:

  1. The position of the image
  2. The size of the image
  3. The crop of the image

The position and size of the image are fairly straightforward to animate, but the crop turns out to be more difficult to do in a performant way, which I will get into later.

TL;DR

See the conclusion.

Implementation Choices

I wanted to ensure that my solution worked well both in AMP and standalone so that others could use it outside of AMP. The code is available on GitHub.

While AMP itself uses Closure Compiler types, I decided to go with TypeScript instead. I was still able to compile using Closure Compiler to generate a minified version with a few extra steps.

Sketching out the API

While constructing the API for my animation function, I wanted to make sure I supported grouping of DOM reads and writes, to avoid forced synchronous layouts. AMP does this internally to ensure components do not spend a lot of time recalculating styles when independent components react to the same events.

To support batching with other reads / writes within AMP, the API needs to be split into two steps. The first step is to measure Elements (for example, where the images are positioned) to figure out the calculations needed to do the animation. The second step is to mutate the DOM (by applying styles) to actually run the animation. The API I came up with looks like:

function prepareImageAnimation(…) {
  … // Measure things, do calculations
  return {
    apply: function() {
      … // apply the DOM mutations
    },
    cleanup: function() {
      … // undo any effects
    },
  };
}

This still allows you to start the animation immediately if you so choose, though it is a bit clumsy for that case. Splitting up the work allows you to start the calculations earlier (e.g. from touchstart), reducing the latency to actually start the animation. While the code is not computation heavy, the first call can be time consuming since the code to generate the animation needs to be parsed/compiled by the browser. This could potentially take more than a few milliseconds, especially on lower-end phones.
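To illustrate why the two-step shape matters, here is a minimal sketch of a batcher that runs every queued measure step before any mutate step. The PhaseBatcher class and its method names are hypothetical; this is neither AMP's internal scheduler nor the library's API.

```typescript
type Task = () => void;

// Hypothetical helper: collects reads and writes, then runs them in phases.
class PhaseBatcher {
  private measures: Task[] = [];
  private mutates: Task[] = [];

  measure(task: Task): void {
    this.measures.push(task);
  }

  mutate(task: Task): void {
    this.mutates.push(task);
  }

  // Runs all queued reads first, then all queued writes, so a write from one
  // caller cannot invalidate layout before another caller's read.
  flush(): void {
    const measures = this.measures;
    const mutates = this.mutates;
    this.measures = [];
    this.mutates = [];
    measures.forEach((task) => task());
    mutates.forEach((task) => task());
  }
}

// Two independent callers queue work in an interleaved order.
const phaseLog: string[] = [];
const phaseBatcher = new PhaseBatcher();
phaseBatcher.mutate(() => phaseLog.push('write A'));
phaseBatcher.measure(() => phaseLog.push('read A'));
phaseBatcher.mutate(() => phaseLog.push('write B'));
phaseBatcher.measure(() => phaseLog.push('read B'));
phaseBatcher.flush();
// phaseLog is now: read A, read B, write A, write B
```

In a page, flush() would typically be called once per frame. The measuring part of prepareImageAnimation would go in a measure task, and the returned apply function in a mutate task.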

Preparing to Animate

The first issue that I encountered was that you could crop <img>s using the object-fit CSS property, and there is no way to animate between two different values of this property. As a result, the first step was to display the original <img> in a way that I could control. The structure I came up with looked like:

<div style="
    overflow: hidden;
    display: flex;
    align-items:center;
    justify-content: center">
  <img src="…" style="width: …; height: …;">
</div>

Note that this assumes the rendered image is centered. There is some additional logic needed to handle different values of object-position, which can move the rendered image around within the cropping container.

The code to calculate the rendered image's width/height in an <img> with a given width, height and object-fit value is relatively straightforward. If you are interested, you can take a look here.
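As a rough sketch of that calculation (not the library's actual code), the rendered size can be derived from the image's natural size, the size of the <img> box and the object-fit value. The function and type names here are assumptions made for illustration.

```typescript
type ObjectFit = 'fill' | 'contain' | 'cover' | 'none' | 'scale-down';

interface Size {
  width: number;
  height: number;
}

// Returns the size at which the image content is painted inside the box.
function getRenderedSize(natural: Size, box: Size, fit: ObjectFit): Size {
  const scaleX = box.width / natural.width;
  const scaleY = box.height / natural.height;
  const scaled = (s: number): Size => ({
    width: natural.width * s,
    height: natural.height * s,
  });

  switch (fit) {
    case 'fill':
      // Stretches to the box, ignoring the aspect ratio.
      return { width: box.width, height: box.height };
    case 'contain':
      // Largest size that fits entirely inside the box.
      return scaled(Math.min(scaleX, scaleY));
    case 'cover':
      // Smallest size that covers the whole box (the overflow is cropped).
      return scaled(Math.max(scaleX, scaleY));
    case 'none':
      return scaled(1);
    case 'scale-down':
      // Whichever of 'none' and 'contain' gives the smaller result.
      return scaled(Math.min(1, scaleX, scaleY));
  }
}

// A 400x200 image in a 100x100 box with object-fit: cover is painted at
// 200x100, so 50px is cropped from each side.
const coverExample = getRenderedSize(
  { width: 400, height: 200 },
  { width: 100, height: 100 },
  'cover',
);
```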

Animating Properties

Performing a smooth animation for the rendered image size and position is relatively straightforward using the CSS transform property. It is GPU accelerated and performs well on low end devices, even if the main thread is busy. The crop is, unfortunately, more involved. First, I researched what properties could be animated cheaply. I remembered an article from a few years back about which properties were safe to animate. Unfortunately, I found that this has not really changed since, which means you cannot animate clip or the newer clip-path with the same performance.

The basic approach for animating clipping is not too complicated. You can scale up (or down) the container that has overflow: hidden in both the x and y directions to adjust the size. You can then also scale the contents of the container with an inverse scale to counteract the scaling on the container. The result will be a larger clipping container, with content the same size as before.

Note that the curves for the scale up and the counteracting scale are not the same. At any given time, the product of both values should be one. For example, if you have a linear scale up, f(t) = a*t, then you need a counter scale of g(t) = 1/(a*t) to cancel it out as the animation progresses, and g is not linear.

I had initially planned to implement the animation for a single timing curve. I started off by using my desired specific cubic-bezier() for the scale animation. I found that for some curves, I could find another curve that counteracted the scaling fairly well, but the starts and ends of the animation could be off enough to be noticeable. That is, the start and end of the animation did not feel as natural as the middle.

I eventually found that for any given input curve, I could approximate the inverting scale using multiple curves, but finding those curves was not trivial. Given this, it made sense to support arbitrary curves instead of a predefined one. This brought me to a few options:

  1. Do the animation in JS
  2. Do the animation using Animation Worklet
  3. Generate keyframes using the Web Animations API
  4. Generate keyframes using a dynamically created stylesheet

The first option was a non-starter: I wanted to make sure the animation was smooth no matter what was running on the main thread. The second option is a work-in-progress API and would only help in Chrome. The third option would have required a polyfill for the Web Animations API, which I did not want, so the last option won by default.

Generating Keyframes

Keyframes tell the browser the values for different properties at some point within the animation. For example, take the following keyframes:

@keyframes some-name {
  0% { transform: scale(0.1); }
  20% { transform: scale(0.5); }
  100% { transform: scale(1.0); }
}

This tells the browser that it should start the animation at scale 0.1. At 20% into the animation, it should scale to 0.5. Finally, at the end, it should scale to 1.0.

The browser then interpolates between the given keyframes (using the animation-timing-function) to find the scale value at any given time. For example, given:

  • The above keyframes
  • An animation duration of 1000ms
  • A linear animation-timing-function
  • An offset of 100ms into the animation

The browser will scale to 0.3 as that is the value half way between the 0% and 20% keyframes.
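The arithmetic above can be checked with a small sketch of linear keyframe interpolation. This only mimics what the browser does for a linear timing function. The names are made up for illustration.

```typescript
interface ScaleKeyframe {
  offset: number; // position in the animation, from 0 to 1
  scale: number;
}

// Linearly interpolates the scale between the two keyframes around `progress`.
function interpolateScale(keyframes: ScaleKeyframe[], progress: number): number {
  for (let i = 1; i < keyframes.length; i++) {
    const prev = keyframes[i - 1];
    const next = keyframes[i];
    if (progress <= next.offset) {
      const local = (progress - prev.offset) / (next.offset - prev.offset);
      return prev.scale + (next.scale - prev.scale) * local;
    }
  }
  return keyframes[keyframes.length - 1].scale;
}

const exampleKeyframes: ScaleKeyframe[] = [
  { offset: 0, scale: 0.1 },
  { offset: 0.2, scale: 0.5 },
  { offset: 1, scale: 1.0 },
];

// 100ms into a 1000ms animation is a progress of 0.1, half way between the
// 0% and 20% keyframes, which gives approximately 0.3.
const scaleAt100ms = interpolateScale(exampleKeyframes, 100 / 1000);
```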

To animate, I needed to generate two sets of keyframes: one for the scale of the container and one to counteract the scale. The product of these keyframes should always be one (so that the image inside is not affected by the scale for the crop). Note that the interpolated value (from the keyframes) for the scale and counter scale will not exactly multiply to one at an arbitrary point in time. I found I could get it close enough to not be noticeable during the animation, but it does cause other problems. See the lightbox demo section for more info.
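A minimal sketch of generating the two sets of keyframes as CSS text follows, assuming the per-sample scale values have already been calculated. The function names and the "-crop" / "-counter" suffixes are hypothetical, not the library's output format.

```typescript
interface ScaleSample {
  offset: number; // position in the animation, from 0 to 1
  scaleX: number;
  scaleY: number;
}

// Serializes samples into a single @keyframes rule.
function buildKeyframes(animationName: string, samples: ScaleSample[]): string {
  const frames = samples.map((s) => {
    const percent = (s.offset * 100).toFixed(2);
    return `  ${percent}% { transform: scale(${s.scaleX}, ${s.scaleY}); }`;
  });
  return `@keyframes ${animationName} {\n${frames.join('\n')}\n}`;
}

// Builds the container keyframes and the inverse keyframes for its content.
// At each keyframe the two scales multiply to exactly one. Between keyframes
// the product is only approximately one.
function buildCropKeyframes(
  prefix: string,
  samples: ScaleSample[],
): { container: string; content: string } {
  const counter = samples.map((s) => ({
    offset: s.offset,
    scaleX: 1 / s.scaleX,
    scaleY: 1 / s.scaleY,
  }));
  return {
    container: buildKeyframes(`${prefix}-crop`, samples),
    content: buildKeyframes(`${prefix}-counter`, counter),
  };
}

const cropCss = buildCropKeyframes('img', [
  { offset: 0, scaleX: 0.5, scaleY: 0.5 },
  { offset: 1, scaleX: 1, scaleY: 1 },
]);
```

The resulting strings could be placed into a dynamically created <style> element, with the element's animation-timing-function set to linear so that the browser interpolates linearly between the generated keyframes.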

In order to generate keyframes, my initial thought was to generate evenly spaced keyframes (e.g. 10%, 20%, ...) to make sure the interpolated value would be close enough at any given time. However, the formula for the cubic-bezier() function does not provide an easy way to get the output value as a function of time. Instead, it lets you calculate x/y pairs given a t parameter. The x value corresponds to a time in the animation, with the y value corresponding to the percentage between the start/end of the animation at that time. This uses the following formula:

B(t) = (1 - t)^3 P_0 + 3(1 - t)^2 t P_1 + 3(1 - t) t^2 P_2 + t^3 P_3

The browser's cubic-bezier() function always uses P_0 = (0, 0) and P_3 = (1, 1). P_1 and P_2 are the first and second x/y control points passed to the function.
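A short sketch of evaluating that formula follows. With the fixed end points, the first term vanishes and the last term reduces to t^3, so each coordinate only depends on its two control values. The function names are assumptions for illustration.

```typescript
interface CurvePoint {
  x: number; // time, as a fraction of the animation duration
  y: number; // output, as a fraction between the start and end values
}

// One coordinate of the curve, using 0 and 1 as the fixed end values.
function bezierCoordinate(t: number, p1: number, p2: number): number {
  const u = 1 - t;
  return 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t;
}

// Samples the curve at `count` values of t spaced evenly from 0 to 1.
function sampleCubicBezier(
  x1: number,
  y1: number,
  x2: number,
  y2: number,
  count: number,
): CurvePoint[] {
  const points: CurvePoint[] = [];
  for (let i = 0; i < count; i++) {
    const t = i / (count - 1);
    points.push({
      x: bezierCoordinate(t, x1, x2),
      y: bezierCoordinate(t, y1, y2),
    });
  }
  return points;
}

// 21 samples across t for cubic-bezier(0.8, 0, 0.2, 1).
const curvePoints = sampleCubicBezier(0.8, 0, 0.2, 1, 21);
```

Each sampled point can be used directly as a keyframe: x gives the keyframe's percentage and y gives the progress to apply at that percentage.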

[Figure: cubic-bezier]

I initially used some code that would solve for the y value for a given x value and then generated evenly spaced keyframes that way. One thing I noticed is that, depending on the input curve, there were some keyframes where not much was happening and others where a lot was happening. To visualize this, take a look at the following curve: cubic-bezier(0.8, 0, 0.2, 1). This is the curve I was developing against as it is used in AMP's lightbox animation. The output as a function of time, with evenly spaced samples, looks like:

Figure: Output (as a percentage between the start and end values) over the course of an animation for a cubic bezier curve corresponding to cubic-bezier(0.8, 0, 0.2, 1), with 21 even samples across time plotted on top.
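One way to solve for the y value at a given x is a bisection search over t. This is a sketch under that assumption, not necessarily the code used at the time. It relies on x(t) being non-decreasing, which holds because cubic-bezier() requires the x control values to be between 0 and 1.

```typescript
// One coordinate of the curve, using 0 and 1 as the fixed end values.
function curveCoordinate(t: number, p1: number, p2: number): number {
  const u = 1 - t;
  return 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t;
}

// Finds the output (y) for a given time fraction (x) by bisecting on t.
function solveBezierY(
  x: number,
  x1: number,
  y1: number,
  x2: number,
  y2: number,
): number {
  let lo = 0;
  let hi = 1;
  // 40 halvings narrow t to well below any visible difference.
  for (let i = 0; i < 40; i++) {
    const mid = (lo + hi) / 2;
    if (curveCoordinate(mid, x1, x2) < x) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return curveCoordinate((lo + hi) / 2, y1, y2);
}

// Evenly spaced samples across time for cubic-bezier(0.8, 0, 0.2, 1).
const evenSamples: number[] = [];
for (let i = 0; i <= 20; i++) {
  evenSamples.push(solveBezierY(i / 20, 0.8, 0, 0.2, 1));
}
```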

Let us look at how the points will be distributed if we sample across t with 21 samples and plot them on top of our curve:

Figure: Output over the course of an animation for a cubic bezier curve corresponding to cubic-bezier(0.8, 0, 0.2, 1), with 21 samples of t plotted on top.

Note that at the start and end, the points are more spread out along the time axis. Since the output value is not changing much at these points, the amount of error that can be introduced from interpolation is also small. Also note that in the center, where the output is changing more rapidly, our points are closer together.

Visualizing How the Curve is Formed

The curve is a combination of how the x and y values change with t.

Figure: The value of the y-coordinate as a function of t for cubic-bezier(0.8, 0, 0.2, 1), with y(t) = t plotted as a reference.
Figure: The value of the x-coordinate as a function of t for cubic-bezier(0.8, 0, 0.2, 1), with x(t) = t plotted as a reference.

The y-value above controls how the output changes with t. The x-value controls how rapidly we progress along the output value. Note that the x-value changes slowly in the middle, resulting in the samples being more tightly spaced together. The y-value changes smoothly over time, resulting in the samples having a similar change in output value.

See how the graphs in the section were created.

Determining the Number of Samples

To understand how many samples to generate, I first tried out a few different values to see how many samples were needed to make the animation feel fluid. Second, I checked how much additional accuracy a given number of samples would give me.

To figure out how much accuracy I got from a given number of samples, I calculated the maximum difference between the interpolated values for scale * counterScale and the expected value (if no interpolation was done). I calculated the error in the middle of each pair of keyframes. Note that I was interested mostly in the maximum error (how far can any given frame be off by) rather than the average error.

I found when using 11 samples, the animation could be off by up to 2% on any given frame, which was somewhat surprisingly noticeable to me. Increasing to 21 samples dropped the maximum error to 0.4%. Finally, I checked with 31 samples and saw it only dropped to 0.2%. I stuck with 21 samples. The animation looked fine and increasing the number of samples did not improve the accuracy by much.
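The error measurement described above can be sketched as follows. The start and end scale values (0.25 and 1) are made-up inputs, so the resulting numbers are not expected to match the percentages quoted above, which depend on the actual images being animated.

```typescript
// One coordinate of a cubic bezier with fixed end values of 0 and 1.
function progressAt(t: number, p1: number, p2: number): number {
  const u = 1 - t;
  return 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t;
}

// Scale values at `count` keyframes, sampled evenly across t.
function sampleScales(
  count: number,
  fromScale: number,
  toScale: number,
  y1: number,
  y2: number,
): number[] {
  const scales: number[] = [];
  for (let i = 0; i < count; i++) {
    const progress = progressAt(i / (count - 1), y1, y2);
    scales.push(fromScale + (toScale - fromScale) * progress);
  }
  return scales;
}

// Largest deviation of scale * counterScale from one, measured half way
// between each pair of keyframes, where both values are linearly interpolated.
function maxProductError(scales: number[]): number {
  let maxError = 0;
  for (let i = 1; i < scales.length; i++) {
    const a = scales[i - 1];
    const b = scales[i];
    const midScale = (a + b) / 2;
    const midCounterScale = (1 / a + 1 / b) / 2;
    maxError = Math.max(maxError, Math.abs(midScale * midCounterScale - 1));
  }
  return maxError;
}

// Compare sample counts for the y control values of cubic-bezier(0.8, 0, 0.2, 1).
const errorWith11 = maxProductError(sampleScales(11, 0.25, 1, 0, 1));
const errorWith21 = maxProductError(sampleScales(21, 0.25, 1, 0, 1));
const errorWith31 = maxProductError(sampleScales(31, 0.25, 1, 0, 1));
```

Only the y values matter for this measurement. The x values decide when each keyframe happens, but the half-way values are the same plain averages regardless of how far apart in time the two keyframes are.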

Supporting Scrolling During the Animation

My initial version of the animation used position: fixed on the animating element. As a result, the animation would not behave as expected if the user scrolled during the animation, as illustrated below:

A video showing scrolling during the image animation when using position: fixed.

This was corrected by using position: absolute on the animating element. Note, when the target image is within a container that scrolls (other than <body>), the animating element should be placed in that container. For an example, see the hero animation demo.

When using position: absolute, the top and left values need to be set relative to the first ancestor with a position value that is not static (either relative, absolute, sticky or fixed) or the <html> element if no such ancestor exists. This is an element's offsetParent, with two exceptions. First, when there are no positioned ancestors, the offsetParent will be the <body>. When the <body> is not positioned, the <html> element should be used instead. Second, the offsetParent for <body> is null. In this case, the <html> element should be used as well.

Once I had the correct ancestor, I simply needed to calculate the difference between the top/left values for the target <img> and ancestor ClientRects. After making the changes, everything worked as expected:

A video showing scrolling during the image animation when using position: absolute.
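The ancestor lookup and offset calculation described above can be sketched as below. The types are deliberately loose so the logic can run without a DOM, and the function names are hypothetical. The sketch ignores ancestor borders and the scroll offset of nested scrolling containers, which a complete implementation would likely need to account for.

```typescript
interface HasOffsetParent {
  readonly offsetParent: object | null;
}

interface DocumentLike {
  readonly body: object;
  readonly documentElement: object;
}

// Finds the element that `top` and `left` are relative to for an absolutely
// positioned sibling of `el`.
function getPositioningAncestor(
  el: HasOffsetParent,
  doc: DocumentLike,
  isPositioned: (node: object) => boolean,
): object {
  const candidate = el.offsetParent;
  // The offsetParent of <body> is null: use <html>.
  if (candidate === null) {
    return doc.documentElement;
  }
  // No positioned ancestors gives <body>. If <body> itself is not
  // positioned, <html> is the real reference.
  if (candidate === doc.body && !isPositioned(doc.body)) {
    return doc.documentElement;
  }
  return candidate;
}

interface RectLike {
  readonly top: number;
  readonly left: number;
}

// Offset of the target relative to the ancestor, from their client rects.
function getOffsetFromAncestor(target: RectLike, ancestor: RectLike): RectLike {
  return {
    top: target.top - ancestor.top,
    left: target.left - ancestor.left,
  };
}
```

In a page, isPositioned could be implemented by checking that getComputedStyle(node).position is not static, and the rects would come from getBoundingClientRect(). Note that offsetParent is also null for elements that are not rendered or that use position: fixed, which this sketch does not distinguish.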

Supporting object-position

While doing my initial planning / reading for this project, I overlooked the existence of the object-position CSS property. The code I had initially written assumed the rendered image would always be centered. Trying to fix this took some additional time to get right.

Three different <img> Elements with the same src, but with an object-position of left, center and right respectively.

While reading how object-position worked was fairly straightforward, I had quite a bit of trouble trying to implement it correctly. The unit tests for the replacement image were not too hard to get working, but the animation itself was still not accurate.

The main problem was that I had not completely removed my assumptions regarding the image's position. Since the scale transforms were using the default transform-origin, which centers as you scale, the image would be rendered with an incorrect position when I went to translate it for the object-position. Changing over to transform-origin: top left made things a lot easier to reason about.

This code ended up adding more complexity than I initially thought. This is reflected in an additional 400 bytes of compressed payload. In order to figure out how the rendered image was positioned inside the <img> Element, I had to parse the value of the object-position property. Fortunately, getComputedStyle() will return resolved values for CSS properties. As a result, the computed value of object-position: center center is 50% 50%, so I did not need to understand values like center or right.

The computed value can get a bit tricky however, since you can offset the position by a pixel (or other unit) value. For example: object-position: top 2px right 1em, is resolved to object-position: calc(-16px + 100%) 2px. To figure out the percent and pixel values I needed to move things by, I had to write some ad-hoc code, including a regex to extract out the parts. In the future, the CSS Typed Object Model will make getting this information from the browser much easier.
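A rough sketch of that kind of parsing follows, assuming the computed value only ever contains percentages, pixel lengths and calc() sums of the two. The function names and the regular expression are illustrative and not the library's code.

```typescript
interface PositionComponent {
  percent: number;
  px: number;
}

// Splits on whitespace, except for whitespace inside parentheses.
function splitTopLevel(value: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let current = '';
  for (let i = 0; i < value.length; i++) {
    const ch = value.charAt(i);
    if (ch === '(') depth++;
    if (ch === ')') depth--;
    if (depth === 0 && /\s/.test(ch)) {
      if (current) parts.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current) parts.push(current);
  return parts;
}

// Sums up the signed percentage and pixel terms of one component.
function parseComponent(component: string): PositionComponent {
  const result: PositionComponent = { percent: 0, px: 0 };
  const term = /([+-]?)\s*(\d*\.?\d+)(px|%)/g;
  let match: RegExpExecArray | null;
  while ((match = term.exec(component)) !== null) {
    const amount = parseFloat(match[2]) * (match[1] === '-' ? -1 : 1);
    if (match[3] === '%') {
      result.percent += amount;
    } else {
      result.px += amount;
    }
  }
  return result;
}

// Parses a computed object-position value into its x and y components.
function parseObjectPosition(
  computed: string,
): { x: PositionComponent; y: PositionComponent } {
  const [x = '50%', y = '50%'] = splitTopLevel(computed);
  return { x: parseComponent(x), y: parseComponent(y) };
}

const parsedPosition = parseObjectPosition('calc(-16px + 100%) 2px');
// parsedPosition.x is { percent: 100, px: -16 }
// parsedPosition.y is { percent: 0, px: 2 }
```

The percent value is applied to the difference between the box size and the rendered image size, and the pixel value is then added as a fixed offset.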

Testing

Ideally, I would have set up screenshot based testing for my code in addition to my other tests. This would normally be done through WebDriver or Puppeteer. I looked a bit into setting this up, but it was not as straightforward as standard unit tests. Thus, I have opted for comparing ClientRects in my tests for now. Since this has made writing the tests cumbersome, I have written fewer tests than I would like.

It was important for me to test how the animation rendered while in progress to make sure all the calculations were correct and that the keyframes were working as expected (and would not regress when I changed the code). To do this, I used some code that allows pausing and moving CSS animations to a desired point in time. The code is available on GitHub if you would like to use it for your own tests.

CSS animations can be controlled in testing by first stopping the animation (animation-play-state: paused), and then adding a negative animation-delay to move the animation to the desired position. This adds some bookkeeping to handle existing delays, handle animations on pseudo elements (e.g. ::before) and to handle Shadow DOM, but the code is otherwise fairly simple.
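The core of that technique fits in a few lines. This sketch only computes the style values to apply. The function name is hypothetical, and it leaves out the bookkeeping for pseudo elements and Shadow DOM mentioned above.

```typescript
interface PausedAnimationStyles {
  animationPlayState: 'paused';
  animationDelay: string;
}

// Styles that freeze an animation as if `offsetMs` had elapsed since it
// started. A negative delay makes the animation begin part way through.
function stylesForOffset(
  offsetMs: number,
  originalDelayMs: number = 0,
): PausedAnimationStyles {
  return {
    animationPlayState: 'paused',
    animationDelay: `${originalDelayMs - offsetMs}ms`,
  };
}

// 200ms into an animation with no delay of its own: a delay of -200ms.
const stylesAt200ms = stylesForOffset(200);
// The same point in time for an animation that had a 50ms delay: -150ms.
const stylesAt200msDelayed = stylesForOffset(200, 50);
```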

Controlling the animation from the test is pretty easy. The usage in one of my tests looks like:

import {
  setup as setupAnimations,
  tearDown as tearDownAnimations,
  offset,
} from '../testing/animation-test-controller';

describe('prepareImageAnimation', () => {
  // Stops all existing and future animations
  before(() => { setupAnimations(); });
  // Resumes all existing and future animations
  after(() => { tearDownAnimations(); });
  …
  describe('scrolling', () => {
    …
    it('should have the correct position 200ms in', () => {
      // Call the code to setup the animation. It will not actually animate as
      // it is paused.
      startAnimation(largerImg, smallerImg);
      // Moves all the animations to the desired time.
      offset(200);
      window.scrollTo(50, 75);

      // Now I can check how things are positioned and make sure they are
      // correct.
      const replacement = getIntermediateImg();
      const {top, left} = replacement.getBoundingClientRect();
      // 20% of the animation, starting from 10, going to 500 - 75 from scroll
      expect(top).to.be.closeTo(-51.5, COMPARISON_EPSILON);
      // 20% of the animation, starting from 10, going to 100 - 50 from scroll
      expect(left).to.be.closeTo(-37.5, COMPARISON_EPSILON);
    });
  …
  });
});

The tests were helpful in catching problems as I refactored and added features. Failures were hard to debug, however. Screenshot based tests would have been more useful when trying to reason about test failures.

Creating Demos

After I completed my initial code, I set out to create a few demos to make sure I had made things flexible enough as well as making sure I did not miss any edge cases.

Hero Animation Demo, Fixing Scaling Issues

A video showing a hero animation, with a square cropped image going to an uncropped image.

Writing the demo itself was mostly straightforward. I used divs with position: absolute as pages. Animating from the small to the larger state went well enough. However, the other direction had a noticeable jump at the end of the animation. This seems to be due to the scale, counter scale and image scale multiplications not quite lining up. The underlying cause is that I could not animate the crop directly.

To address this issue, I reversed the way I calculated the animation. Instead of animating from the smaller state to the larger state, I changed to always use the larger state as the reference point. This means when I do an expanding animation, I calculate the end state with everything using transform: scale(1) and then scale down the starting state. There is still some error at the start of the animation, but it tends to be less noticeable due to the smaller size.

This made the code more complicated. However, having my tests and the TypeScript compiler run in the background on every change made the refactor less painful. In retrospect, it might have been easier to always animate from large to small, and use animation-direction: reverse instead of worrying about the direction during the calculations.

Lightbox Demo, Working Around Overflow: Auto

The basics of the lightbox demo are that the image, when in the lightbox, should cover the whole screen, using object-fit: contain to ensure the image is not cropped. When I tried the animation for the first time, I saw the browser's horizontal and vertical scrollbars flicker very rapidly during the animation when testing on desktop.

My code does the image animation using position: absolute to position the transitioning <img>, to allow the animation to follow the page scrolling. As a result, the size of the animating element can affect the overflow scrolling of the parent. It appears that the scaling may be causing the width/height of the animating Elements to go over 100% of the width/height of the body, causing the scrollbars to show up, if not already present.

To work around this issue, I changed the demo to do the transition within a position: fixed container when the body is not already scrollable, that is, when its scrollHeight is less than or equal to its clientHeight. This preserves the ability to scroll during the animation when the body is already scrollable, while not having the scrollbars appear when it is not. Only a small amount of code is needed to handle this, but it is unfortunately something you need to be mindful of.
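The check itself can be sketched as a small pure function. Which element the metrics are read from (for example document.scrollingElement) is an assumption here, as is the function name.

```typescript
interface ScrollMetrics {
  scrollHeight: number;
  clientHeight: number;
}

// Picks the positioning for the container that hosts the transition.
function transitionContainerPosition(
  metrics: ScrollMetrics,
): 'fixed' | 'absolute' {
  // Content that fits cannot be scrolled, so a fixed container is safe and
  // keeps the scaled elements from triggering scrollbars.
  return metrics.scrollHeight <= metrics.clientHeight ? 'fixed' : 'absolute';
}

const fitsOnScreen = transitionContainerPosition({
  scrollHeight: 800,
  clientHeight: 800,
});
const needsScrolling = transitionContainerPosition({
  scrollHeight: 2400,
  clientHeight: 800,
});
```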

Image Gallery Demo, Using Different Resolution Images

I wanted to play around with using a lower resolution source for the smaller image and a higher resolution one for the larger image. One nice thing about using an animation is that it masks the latency to retrieve the higher resolution image. I decided to use the srcset attribute to list the sizes for each image, with the sizes attribute deciding which source would be picked. If you are not familiar with srcset, you can read about it here. The DOM structure looks like:

<div class="gallery">
  <div class="thumbnails">
    <img srcset="large-1.png 400w, small-1.png 96w" sizes="96px">
    <img srcset="large-2.png 400w, small-2.png 96w" sizes="96px">
    …
  </div>
  <div class="primary">
    <img srcset="large-1.png 400w, small-1.png 96w" sizes="400px">
  </div>
</div>

While performing the transition, I use the smaller sizes value. After the animation completes, I update it to the larger sizes value. When the browser sees the updated value, it downloads the higher resolution image and uses it once it has finished downloading. To improve the performance, I started downloading the larger resolution image (using a separate <img> Element) at the start of the transition.

The code for this looks like:

function loadLargerImgSrc(img, largerSize) {
  …
  const dummyImg = new Image();
  dummyImg.srcset = img.srcset;
  dummyImg.sizes = largerSize;
  …
}

window.expand = function(event) {
  const srcImg = event.target.closest('img');
  const {applyAnimation, cleanupAnimation} = prepareImageAnimation({
    srcImg,
    targetImg,
    …
  });
  // Set the srcset/sizes to use the smaller image we already have.
  targetImg.srcset = srcImg.srcset;
  targetImg.sizes = srcImg.sizes;
  // Preload the larger image before starting the animation.
  loadLargerImgSrc(srcImg, targetImgSizes);
  …
  applyAnimation();
  setTimeout(() => {
    // Setting the original (larger) sizes attribute to use the higher
    // resolution when available.
    targetImg.sizes = targetImgSizes;
    cleanupAnimation();
  }, …);
}

Next I tried starting the preload on mousedown/touchstart to give even more time to start downloading the higher resolution image. At this point I ran into an issue on Chrome: the image did not appear during the animation when the higher resolution image was not already cached.

I initially thought this might be a bug, but a coworker noted that this behavior was actually allowed per the spec (though arguably not ideal in this case). Specifically, the user agent is allowed to pick which source to use, regardless of how things match between the srcset and sizes attributes. Chrome was preferring to use a request that was in flight over an already cached resource matching the requested size, which caused the image to be blank during the transition. Fortunately, Chrome supports the currentSrc property, which tells you which source from a srcset it is using for an image. Instead of applying the smaller <img>'s srcset/sizes, you can directly set the src. The code looks like:

if (srcImg.currentSrc) {
  targetImg.src = srcImg.currentSrc;
  targetImg.srcset = '';
  targetImg.sizes = '';
} else {
  targetImg.src = '';
  targetImg.srcset = srcImg.srcset;
  targetImg.sizes = srcImg.sizes;
}

And at the end of the animation:

targetImg.srcset = srcImg.srcset;
targetImg.sizes = targetImgSizes;

Note that we did not need to clear the src from targetImg at the end, as srcset is preferred over src.

Inline Image Expansion Demo

I created some demos to show how different values of object-position look while animating. The setup for the animation was fairly straightforward: simply place a smaller and larger version of the image on top of each other with position: absolute;. In order to animate the "height", I used a CSS transform on everything later in the DOM. It is unclear if this is a good idea (or any better than simply animating the height), since it forces a large number of elements into separate layers.

One problem with simply setting the height (or translating subsequent elements) is that the position of elements might be incorrect if elements are sized in a responsive way and you resize the window. To fix this, you could resize the container after the animation completes by removing position: absolute from the larger image, adding it to the smaller image and removing the transforms. One thing to be careful of is to not do this while any other expand/collapse animations are in flight.

How It Went

Some things went well, some were mixed, and some did not go well.

Positive

The animation performs very smoothly, even on an older, lower power phone. Beyond setting up the animation, I did not need to do anything to make it perform well. The animation is fluid, even when scrolling during the animation.

Positive

TypeScript + iterative tests worked well for refactoring the code. Both the types and tests caught a lot of little mistakes while moving things around, which were quick to fix since I caught issues while modifying the code.

Positive

Demos helped me flesh out use cases and make sure my API covered various different use cases. I also had fun doing the demos, which was a good change of pace from writing the code itself.

Positive

Using Closure Compiler saved ~500 bytes (~2.5KB -> ~2KB) compressed (~20%) on the output compared to Rollup + Uglify. Getting it to work with TypeScript required a bit of time (and a few workarounds) however.

Mixed

While the API supports several use cases, the code is not really composable. For example, the library does not support adding a circular crop in addition to a rectangular crop.

Mixed

The code size ended up larger than I would have liked, coming to about 2KB min + gzip. Supporting object-position added approximately 400 bytes of payload.

Mixed

Using objects for parameters kept my code more readable/refactorable than having long parameter lists. Unfortunately, Closure Compiler does not yet seem to optimize these into normal parameters, even when it knows all the call sites. This results in more bytes of output.

Negative

Not being able to animate crop significantly increased code size/complexity/implementation time. It also caused problems (rounding issues when scaling up and causing scrollbars to flicker in lightbox demo) that had to be worked around, with additional complexity.

Negative

There are some quite sharp edges in doing the animation. For example, you need to correctly set up the container the animation is in to make sure scrolling during the animation works well.

Negative

I started down the wrong path, trying to generate a cubic-bezier() curve to counteract the scaling. I should have moved on from that approach a lot quicker than I did. More time on research might have avoided this.

Negative

Writing screenshot based tests would have been more effort than I wanted to invest to get working. The tests are harder to maintain and add to as a result.

Negative

I did not set up my build in a very good way, resulting in it only working on BSD OSes currently. I did not want to use something like Gulp because I felt like I was doing something really simple and I have spent a lot of time going down that rabbit hole in the past. As a result, I just did things with shell commands. Unfortunately, I use an in-place sed command and cannot figure out how to make it work on both BSD and GNU sed.

Conclusion

The web platform is pretty powerful, letting you perform this animation in a performant way even on low end devices. At the same time, it is not always easy to actually implement performant animations. The inline image expansion demo shows how something that feels like it should be easy has a lot of pitfalls in practice. There are also some things that can be a bit tricky to implement and debug without knowing about various odds and ends of CSS.

The web still has a few rough edges, both on the browser side and in tooling. For example, dynamically generating keyframe names is awkward but will be made easier when Web Animations API support becomes more widespread. Tooling can still be a bit of a hassle for more complicated setups (e.g. TypeScript + Closure Compiler or screenshot based tests).

Some of the browser APIs can be opaque. I needed to include code to figure out how the image would be rendered given the container size and the object-fit / object-position properties. It would have been nice to be able to get this information from the browser itself. While not a huge amount of extra code for this one case, there are many times where the APIs fall short and we end up writing extra code as developers. These end up adding up to a lot of extra bytes we need to ship down to clients.

Overall, I am really happy with how smooth the animation turned out, even on slower devices. I hope that this sort of polish becomes more common on the web over time and does not remain solely in mobile apps. Please feel free to check out the code, the documentation, and the demos to see how to use it on your own page. It is also available via npm.

Appendix

Switching to TypeScript

I initially started my code as pure JS, then added Closure Compiler type annotations, before finally settling on TypeScript. I chose TypeScript for a few reasons:

  • Build time is fast
  • Type checking works well during refactoring
  • Editor support in Visual Studio is really good

Getting TypeScript itself working was very quick and easy to do, but getting it to work with Closure Compiler was a bit of a challenge for me. I had heard about tsickle, which could turn my TypeScript into Closure Compiler annotated code, but I had a hard time finding the documentation of how to actually use it for my non-Angular code.

It also required me to have my tsconfig.json use certain options, which prevented me from generating my desired normal build output. I had to work around this by creating a second tsconfig for my regular build which extended the first one. In addition, I had to use a specific version of TypeScript (2.9) to use tsickle rather than using the latest version.