As a courtesy, this is a full free rendering of my book, Programming iOS 6, by Matt Neuburg. Copyright 2013 Matt Neuburg. Please note that this edition is outdated; the current books are iOS 13 Programming Fundamentals with Swift and Programming iOS 13. If my work has been of help to you, please consider purchasing one or both of them, or you can reward me through PayPal at http://www.paypal.me/mattneub. Thank you!

Chapter 18. Touches

[Winifred the Woebegone illustrates hit-testing:] Hey nonny nonny, is it you? — Hey nonny nonny nonny no! — Hey nonny nonny, is it you? — Hey nonny nonny nonny no!

Marshall Barer, Once Upon a Mattress

A touch is an instance of the user putting a finger on the screen. The system and the hardware, working together, know when a finger contacts the screen and where it is. Fingers are fat, but the system and the hardware cleverly reduce the finger’s location to a single appropriate point.

A UIView, by virtue of being a UIResponder, is the visible locus of touches. There are other UIResponder subclasses, but none of them is visible on the screen. What the user sees are views; what the user is touching are views. (The user may also see layers, but a layer is not a UIResponder and is not involved with touches. I’ll talk later about how to make it seem as if the user can touch a layer.)

It would make sense, therefore, if every touch were reported directly to the view in which it occurred. However, what the system “sees” is not particular views but an app as a whole. So a touch is represented as an object (a UITouch instance) which is bundled up in an envelope (a UIEvent) which the system delivers to your app. It is then up to your app to deliver the envelope to an appropriate UIView. In the vast majority of cases, this will happen automatically the way you expect, and you will respond to a touch by way of the view in which the touch occurred.

In fact, usually you won’t concern yourself with UIEvents and UITouches at all. Most built-in interface views deal with these low-level touch reports themselves, and notify your code at a higher level. When a UIButton emits an action message to report a control event such as Touch Up Inside (Chapter 11), it has already performed a reduction of a complex sequence of touches (“the user put a finger down inside me and then, possibly with some dragging hither and yon, raised it when it was still reasonably close to me”). A UITextField reports touches on the keyboard as changes in its own text. A UITableView reports that the user selected a cell. A UIScrollView, when dragged, reports that it scrolled; when pinched outward, it reports that it zoomed.

Nevertheless, it is useful to know how to respond to touches directly, so that you can implement your own touchable views, and so that you understand what Cocoa’s built-in views are actually doing. This chapter discusses touch detection and response by views (and other UIResponders) at their lowest level, along with a slightly higher-level mechanism, gesture recognizers, that categorizes touches into gesture types for you; then it deconstructs the touch-delivery architecture by which touches are reported to your views in the first place.

Touch Events and Views

Imagine a screen that the user is not touching at all: the screen is “finger-free.” Now the user touches the screen with one or more fingers. From that moment to the time the screen is once again finger-free, all touches and finger movements together constitute what Apple calls a single multitouch sequence.

The system reports to your app, during a given multitouch sequence, every change in finger configuration, so that your app can figure out what the user is doing. Every such report is a UIEvent. In fact, every report having to do with the same multitouch sequence is the same UIEvent instance, arriving repeatedly, each time there’s a change in finger configuration.

Every UIEvent reporting a change in the user’s finger configuration contains one or more UITouch objects. Each UITouch object corresponds to a single finger; conversely, every finger touching the screen is represented in the UIEvent by a UITouch object. Once a certain UITouch instance has been created to represent a finger that has touched the screen, the same UITouch instance is used to represent that finger throughout this multitouch sequence until the finger leaves the screen.

Now, it might sound as if the system has to bombard the app with huge numbers of reports constantly during a multitouch sequence. But that’s not really true. The system needs to report only changes in the finger configuration. For a given UITouch object (representing, remember, a specific finger), only four things can happen. These are called touch phases, and are described by a UITouch instance’s phase property:

UITouchPhaseBegan
The finger touched the screen for the first time; this UITouch instance has just been created. This is always the first phase, and arrives only once.
UITouchPhaseMoved
The finger moved upon the screen.
UITouchPhaseStationary
The finger remained on the screen without moving. Why is it necessary to report this? Well, remember, once a UITouch instance has been created, it must be present every time the UIEvent arrives. So if the UIEvent arrives because something else happened (e.g., a new finger touched the screen), we must report what this finger has been doing, even if it has been doing nothing.
UITouchPhaseEnded
The finger left the screen. Like UITouchPhaseBegan, this phase arrives only once. The UITouch instance will now be destroyed and will no longer appear in UIEvents for this multitouch sequence.

Those four phases are sufficient to describe everything that a finger can do. Actually, there is one more possible phase:

UITouchPhaseCancelled
The system has aborted this multitouch sequence because something interrupted it.

What might interrupt a multitouch sequence? There are many possibilities. Perhaps the user clicked the Home button or the screen lock button in the middle of the sequence. A local notification alert may have appeared (Chapter 26); on an actual iPhone, a call might have come in. (As we shall see, a gesture recognizer recognizing its gesture may also trigger touch cancellation.) The point is, if you’re dealing with touches yourself, you cannot afford to ignore touch cancellation; it is your opportunity to get things into a coherent state when the sequence is interrupted.

When a UITouch first appears (UITouchPhaseBegan), your app works out which UIView it is associated with. (I’ll give full details, later in this chapter, as to how it does that.) This view is then set as the touch’s view property; from then on, this UITouch is always associated with this view. In other words, a touch’s view is that touch’s view forever (until that finger leaves the screen).

The same UIEvent containing the same UITouches can be sent to multiple views; after all, these are programmatic objects, not real-world envelopes containing actual fingers. Accordingly, a UIEvent is distributed to all the views of all the UITouches it contains. Conversely, if a view is sent a UIEvent, it’s because that UIEvent contains at least one UITouch whose view is this view.

If every UITouch in a UIEvent associated with a certain UIView has the phase UITouchPhaseStationary, that UIEvent is not sent to that UIView. There’s no point, because as far as that view is concerned, nothing happened.

Receiving Touches

A UIResponder, and therefore a UIView, has four methods corresponding to the four UITouch phases that require UIEvent delivery. A UIEvent is delivered to a view by calling one or more of these four methods (the touches... methods):

touchesBegan:withEvent:
A finger touched the screen, creating a UITouch.
touchesMoved:withEvent:
A finger previously reported to this view with touchesBegan:withEvent: has moved.
touchesEnded:withEvent:
A finger previously reported to this view with touchesBegan:withEvent: has left the screen.
touchesCancelled:withEvent:
We are bailing out on a finger previously reported to this view with touchesBegan:withEvent:.

The parameters of these methods are:

The relevant touches
These are the event’s touches whose phase corresponds to the name of the method and (normally) whose view is this view. They arrive as an NSSet (Chapter 10). If you know for a fact that there is only one touch in the set, or that any touch in the set will do, you can retrieve it with anyObject (an NSSet doesn’t implement lastObject because a set is unordered).
The event
This is the UIEvent instance. It contains its touches as an NSSet, which you can retrieve with the allTouches message. This means all the event’s touches, including but not necessarily limited to those in the first parameter; there might be touches in a different phase or intended for some other view. You can call touchesForView: or touchesForWindow: to ask for the set of touches associated with a particular view or window.
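To make the difference between the two parameters concrete, here is a minimal sketch that logs what each one contains. The class name TouchLoggingView is hypothetical, invented for illustration; the sketch also assumes the view’s multipleTouchEnabled is YES, since otherwise the view will never see more than one touch.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view class, for illustration only.
@interface TouchLoggingView : UIView
@end

@implementation TouchLoggingView

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    // first parameter: touches in the Began phase for this view
    NSLog(@"began here: %lu", (unsigned long)[touches count]);
    // every touch in the event, in any phase, for any view
    NSLog(@"in the event: %lu",
          (unsigned long)[[event allTouches] count]);
    // touches in the event whose view is this view, in any phase
    NSLog(@"for this view: %lu",
          (unsigned long)[[event touchesForView:self] count]);
    // the phase of each touch, as a raw UITouchPhase value
    for (UITouch* t in [event allTouches]) {
        NSLog(@"touch %p is in phase %ld", t, (long)t.phase);
    }
}

@end
```

If a second finger comes down while the first is still on the screen, the first parameter should contain only the new touch, while allTouches contains both.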

A UITouch has some useful methods and properties:

locationInView:, previousLocationInView:
The current and previous location of this touch with respect to the coordinate system of a given view. The view you’ll be interested in will often be self or self.superview; supply nil to get the location with respect to the window. The previous location will be of interest only if the phase is UITouchPhaseMoved.
timestamp
When the touch last changed. A touch is timestamped when it is created (UITouchPhaseBegan) and each time it moves (UITouchPhaseMoved).
tapCount
If two touches are in roughly the same place in quick succession, and the first one is brief, the second one may be characterized as a repeat of the first. They are different touch objects, but the second will be assigned a tapCount one larger than the previous one. The default is 1, so if (for example) a touch’s tapCount is 3, then this is the third tap in quick succession in roughly the same spot.
view
The view with which this touch is associated.
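As a rough sketch of how these methods and properties might be consulted, the following view logs a touch’s movement and, when the touch ends, its tap count. The class name TouchInfoView is hypothetical; the sketch assumes a single touch, so anyObject is sufficient.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view class, for illustration only.
@interface TouchInfoView : UIView
@end

@implementation TouchInfoView

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    UITouch* t = [touches anyObject];
    // both locations are expressed in this view's own coordinates
    CGPoint then = [t previousLocationInView:self];
    CGPoint now = [t locationInView:self];
    NSLog(@"moved from %@ to %@ at time %f",
          NSStringFromCGPoint(then), NSStringFromCGPoint(now),
          t.timestamp);
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    UITouch* t = [touches anyObject];
    NSLog(@"tap count %lu; touch's view is self? %d",
          (unsigned long)t.tapCount, (t.view == self));
}

@end
```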

Here are some additional UIEvent properties:

type
This will be UIEventTypeTouches. There are other event types, but you’re not going to receive any of them this way.
timestamp
When the event occurred.

So, when we say that a certain view is receiving a touch, that is a shorthand expression meaning that it is being sent a UIEvent containing this UITouch, over and over, by calling one of its touches... methods, corresponding to the phase this touch is in, from the time the touch is created until the time it is destroyed.

Restricting Touches

Touch events can be turned off entirely at the application level with UIApplication’s beginIgnoringInteractionEvents. It is quite common to do this during animations and other lengthy operations during which responding to a touch could cause undesirable results. This call should be balanced by endIgnoringInteractionEvents. Pairs can be nested, in which case interactivity won’t be restored until the outermost endIgnoringInteractionEvents has been reached.
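Here is a minimal sketch of the balanced pair wrapped around an animation. The function name FadeOutIgnoringTouches and the 0.4-second duration are assumptions made for the sake of the example.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper function, for illustration only.
static void FadeOutIgnoringTouches(UIView* v) {
    UIApplication* app = [UIApplication sharedApplication];
    // turn off touch events for the whole app
    [app beginIgnoringInteractionEvents];
    [UIView animateWithDuration:0.4 animations:^{
        v.alpha = 0;
    } completion:^(BOOL finished) {
        // balance the earlier call, whether or not the
        // animation ran to completion
        [app endIgnoringInteractionEvents];
    }];
}
```

Putting the balancing call in the completion handler ensures that it is reached exactly once for each call to beginIgnoringInteractionEvents.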

A number of UIView properties also restrict the delivery of touches to particular views:

userInteractionEnabled
If set to NO, this view (along with its subviews) is excluded from receiving touches. Touches on this view or one of its subviews “fall through” to a view behind it.
alpha
If set to 0.0 (or extremely close to it), this view (along with its subviews) is excluded from receiving touches. Touches on this view or one of its subviews “fall through” to a view behind it.
hidden
If set to YES, this view (along with its subviews) is excluded from receiving touches. This makes sense, since from the user’s standpoint, the view and its subviews are not even present.
multipleTouchEnabled
If set to NO, this view never receives more than one touch simultaneously; once it receives a touch, it doesn’t receive any other touches until that first touch has ended.
exclusiveTouch
This is the only one of these properties that can’t be set in the nib. An exclusiveTouch view receives a touch only if no other views in the same window have touches associated with them; once an exclusiveTouch view has received a touch, then while that touch exists no other view in the same window receives any touches.
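By way of illustration, here is a sketch of how some of these properties might be set in code. The function name and the three views passed to it (a decorative overlay, a drawing canvas, and a button) are hypothetical.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper function, for illustration only.
static void ConfigureTouchDelivery(UIView* overlay, UIView* canvas,
                                   UIButton* button) {
    // a decorative overlay: touches fall through to a view behind it
    overlay.userInteractionEnabled = NO;
    // a surface that wants to receive several fingers at once
    canvas.multipleTouchEnabled = YES;
    // while a touch on this button exists, no other view in the
    // same window receives touches
    button.exclusiveTouch = YES;
}
```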

Note

A UIWindow ignores multipleTouchEnabled; it always receives multiple touches. Moreover, a UIWindow’s behavior with respect to exclusiveTouch is unreliable, presumably because it is not itself a view in the window. In general this should not be an issue, since you’ll always have a root view covering the window anyway.

Interpreting Touches

Thanks to the existence of gesture recognizers (discussed later in this chapter), in most cases you won’t have to interpret touches at all; you’ll let a gesture recognizer do most of that work. Even so, it is beneficial to be conversant with the nature of touch interpretation; this will help you interact with a gesture recognizer, write your own gesture recognizer, or subclass an existing one. Furthermore, not every touch sequence can be codified through a gesture recognizer; sometimes, directly interpreting touches is the best approach.

To figure out what’s going on as touches are received by a view, your code must essentially function as a kind of state machine. You’ll receive various touches... method calls, and your response will partly depend upon what happened previously, so you’ll have to record somehow, such as in instance variables, the information that you’ll need in order to decide what to do when the next touches... method is called. Such an architecture can make writing and maintaining touch-analysis code quite tricky. Moreover, although you can distinguish a particular UITouch or UIEvent object over time by keeping a reference to it, you mustn’t retain that reference; it doesn’t belong to you.

To illustrate the business of interpreting touches, we’ll start with a view that can be dragged with the user’s finger. For simplicity, I’ll assume that this view receives only a single touch at a time. (This assumption is easy to enforce by setting the view’s multipleTouchEnabled to NO, which is the default.)

The trick to making a view follow the user’s finger is to realize that a view is positioned by its center, which is in superview coordinates, but the user’s finger might not be at the center of the view. So at every stage of the drag we must change the view’s center by the change in the user’s finger position in superview coordinates:

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    CGPoint loc =
        [[touches anyObject] locationInView: self.superview];
    CGPoint oldP =
        [[touches anyObject] previousLocationInView: self.superview];
    CGFloat deltaX = loc.x - oldP.x;
    CGFloat deltaY = loc.y - oldP.y;
    CGPoint c = self.center;
    c.x += deltaX;
    c.y += deltaY;
    self.center = c;
}
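That method says nothing about touch cancellation. What a draggable view should do when its touch is cancelled is a design decision; one possibility, sketched here as an assumption rather than a recommendation, is to remember where the view was when the touch began and to put it back there if the sequence is interrupted. The class name DraggableView and the instance variable _startCenter are hypothetical.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view class, for illustration only.
@interface DraggableView : UIView
@end

@implementation DraggableView {
    CGPoint _startCenter; // where the view was when the touch began
}

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    self->_startCenter = self.center;
}

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    UITouch* t = [touches anyObject];
    CGPoint loc = [t locationInView: self.superview];
    CGPoint oldP = [t previousLocationInView: self.superview];
    CGPoint c = self.center;
    c.x += loc.x - oldP.x;
    c.y += loc.y - oldP.y;
    self.center = c;
}

- (void) touchesCancelled:(NSSet *)touches withEvent:(UIEvent *)event {
    // the sequence was interrupted; abandon the drag
    self.center = self->_startCenter;
}

@end
```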

Next, let’s add a restriction that the view can be dragged only vertically or horizontally. All we have to do is hold one coordinate steady; but which coordinate? Everything seems to depend on what the user does initially. So we’ll do a one-time test the first time we receive touchesMoved:withEvent:. Now we’re maintaining two state variables, _decided and _horiz:

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    self->_decided = NO;
}

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    if (!self->_decided) {
        self->_decided = YES;
        CGPoint then = [[touches anyObject] previousLocationInView: self];
        CGPoint now = [[touches anyObject] locationInView: self];
        CGFloat deltaX = fabs(then.x - now.x);
        CGFloat deltaY = fabs(then.y - now.y);
        self->_horiz = (deltaX >= deltaY);
    }
    CGPoint loc =
        [[touches anyObject] locationInView: self.superview];
    CGPoint oldP =
        [[touches anyObject] previousLocationInView: self.superview];
    CGFloat deltaX = loc.x - oldP.x;
    CGFloat deltaY = loc.y - oldP.y;
    CGPoint c = self.center;
    if (self->_horiz)
        c.x += deltaX;
    else
        c.y += deltaY;
    self.center = c;
}

Look at how things are trending. We are maintaining state variables, which we are managing across multiple methods, and we are subdividing a touches... method implementation into tests depending on the state of our state machine. Our state machine is very simple, involving just two state variables, but already our code is becoming difficult to read and to maintain. Things only become more messy as we try to make our view’s behavior more sophisticated.

Another area in which manual touch handling can rapidly prove overwhelming is when it comes to distinguishing between different gestures that the user is to be permitted to perform on a view. Imagine, for example, a view that distinguishes between a finger tapping briefly and a finger remaining down for a longer time. We can’t know how long a tap is until it’s over, so one approach might be to wait until then before deciding:

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    self->_time = [[touches anyObject] timestamp];
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    NSTimeInterval diff = event.timestamp - self->_time;
    if (diff < 0.4)
        NSLog(@"short");
    else
        NSLog(@"long");
}

On the other hand, one might argue that if a tap hasn’t ended after some set time (here, 0.4 seconds), we know that it is long, and so we could begin responding to it without waiting for it to end. The problem is that we don’t automatically get an event after 0.4 seconds. So we’ll create one, using delayed performance:

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    self->_time = [[touches anyObject] timestamp];
    [self performSelector:@selector(touchWasLong)
               withObject:nil afterDelay:0.4];
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    NSTimeInterval diff = event.timestamp - self->_time;
    if (diff < 0.4)
        NSLog(@"short");
}

- (void) touchWasLong {
    NSLog(@"long");
}

But there’s a bug. If the tap is short, we report that it was short, but we also report that it was long. That’s because the delayed call to touchWasLong arrives anyway. We could use some sort of boolean flag to tell us when to ignore that call, but there’s a better way: NSObject has a class method that lets us cancel any pending delayed performance calls. So:

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    self->_time = [[touches anyObject] timestamp];
    [self performSelector:@selector(touchWasLong)
               withObject:nil afterDelay:0.4];
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    NSTimeInterval diff = event.timestamp - self->_time;
    if (diff < 0.4) {
        NSLog(@"short");
        [NSObject cancelPreviousPerformRequestsWithTarget:self
                                                 selector:@selector(touchWasLong)
                                                   object:nil];
    }
}

- (void) touchWasLong {
    NSLog(@"long");
}
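The same cancellation call is a natural candidate for touchesCancelled:withEvent:. The following sketch rests on my assumption that an interrupted touch should not be reported as long; the class name LongTapView and the helper method cancelPendingLong are hypothetical.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view class, for illustration only.
@interface LongTapView : UIView
@end

@implementation LongTapView {
    NSTimeInterval _time; // when the touch began
}

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    UITouch* t = [touches anyObject];
    self->_time = t.timestamp;
    [self performSelector:@selector(touchWasLong)
               withObject:nil afterDelay:0.4];
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    NSTimeInterval diff = event.timestamp - self->_time;
    if (diff < 0.4) {
        NSLog(@"short");
        [self cancelPendingLong];
    }
}

- (void) touchesCancelled:(NSSet *)touches withEvent:(UIEvent *)event {
    // assumption: an interrupted touch should report nothing
    [self cancelPendingLong];
}

// hypothetical helper wrapping the NSObject class method
- (void) cancelPendingLong {
    [NSObject cancelPreviousPerformRequestsWithTarget:self
                                             selector:@selector(touchWasLong)
                                               object:nil];
}

- (void) touchWasLong {
    NSLog(@"long");
}

@end
```

If the delayed call has already fired by the time the touch is cancelled, the cancellation call simply finds nothing to cancel.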

Here’s another use of the same technique. We’ll distinguish between a single tap and a double tap. The UITouch tapCount property already makes this distinction, but that, by itself, is not enough to help us react differently to the two. What we must do, having received a tap whose tapCount is 1, is to delay responding to it long enough to give a second tap a chance to arrive. This is unfortunate, because it means that if the user intends a single tap, some time will elapse before anything happens in response to it; however, there’s nothing we can easily do about that.

Distributing our various tasks correctly is a bit tricky. We know when we have a double tap as early as touchesBegan:withEvent:, so that’s when we cancel our delayed response to a single tap, but we respond to the double tap in touchesEnded:withEvent:. We don’t start our delayed response to a single tap until touchesEnded:withEvent:, because what matters is the time between the taps as a whole, not between the starts of the taps. This code is adapted from Apple’s own example:

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    int ct = [[touches anyObject] tapCount];
    if (ct == 2) {
        [NSObject cancelPreviousPerformRequestsWithTarget:self
                                                 selector:@selector(singleTap)
                                                   object:nil];
    }
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    int ct = [[touches anyObject] tapCount];
    if (ct == 1)
        [self performSelector:@selector(singleTap)
                   withObject:nil afterDelay:0.3];
    if (ct == 2)
        NSLog(@"double tap");
}

- (void) singleTap {
    NSLog(@"single tap");
}

Now let’s consider combining our detection for a single or double tap with our earlier code for dragging a view horizontally or vertically. This is to be a view that can detect four kinds of gesture: a single tap, a double tap, a horizontal drag, and a vertical drag. We must include the code for all possibilities and make sure they don’t interfere with each other. The result is rather horrifying, a forced join between two already complicated sets of code, along with an additional pair of state variables to track the decision between the tap gestures on the one hand and the drag gestures on the other:

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    // be undecided
    self->_decidedTapOrDrag = NO;
    // prepare for a tap
    int ct = [[touches anyObject] tapCount];
    if (ct == 2) {
        [NSObject cancelPreviousPerformRequestsWithTarget:self
                                                 selector:@selector(singleTap)
                                                   object:nil];
        self->_decidedTapOrDrag = YES;
        self->_drag = NO;
        return;
    }
    // prepare for a drag
    self->_decidedDirection = NO;
}

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    if (self->_decidedTapOrDrag && !self->_drag)
        return;
    self->_decidedTapOrDrag = YES;
    self->_drag = YES;
    if (!self->_decidedDirection) {
        self->_decidedDirection = YES;
        CGPoint then = [[touches anyObject] previousLocationInView: self];
        CGPoint now = [[touches anyObject] locationInView: self];
        CGFloat deltaX = fabs(then.x - now.x);
        CGFloat deltaY = fabs(then.y - now.y);
        self->_horiz = (deltaX >= deltaY);
    }
    CGPoint loc =
        [[touches anyObject] locationInView: self.superview];
    CGPoint oldP =
        [[touches anyObject] previousLocationInView: self.superview];
    CGFloat deltaX = loc.x - oldP.x;
    CGFloat deltaY = loc.y - oldP.y;
    CGPoint c = self.center;
    if (self->_horiz)
        c.x += deltaX;
    else
        c.y += deltaY;
    self.center = c;
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    if (!self->_decidedTapOrDrag || !self->_drag) {
        // end for a tap
        int ct = [[touches anyObject] tapCount];
        if (ct == 1)
            [self performSelector:@selector(singleTap) withObject:nil
                       afterDelay:0.3];
        if (ct == 2)
            NSLog(@"double tap");
        return;
    }
}

- (void) singleTap {
    NSLog(@"single tap");
}

That code seems to work, but it’s hard to say whether it covers all possibilities coherently; it’s barely legible and the logic borders on the mysterious. This is the kind of situation for which gesture recognizers were devised.

Gesture Recognizers

Writing and maintaining a state machine that interprets touches across a combination of three or four touches... methods is hard enough when a view confines itself to expecting only one kind of gesture, such as dragging. It becomes even more involved when a view wants to accept and respond differently to different kinds of gesture. Furthermore, many types of gesture are conventional and standard; it seems insane to require developers to implement independently the elements that constitute what is, in effect, a universal vocabulary.

The solution is gesture recognizers, which standardize common gestures and allow the code for different gestures to be separated and encapsulated into different objects.

Gesture Recognizer Classes

A gesture recognizer (a subclass of UIGestureRecognizer) is an object attached to a UIView, which has for this purpose methods addGestureRecognizer: and removeGestureRecognizer:, and a gestureRecognizers property. A UIGestureRecognizer implements the four touches... handlers, but it is not a responder (a UIResponder), so it does not participate in the responder chain.

If a new touch is going to be delivered to a view, it is also associated with and delivered to that view’s gesture recognizers if it has any, and that view’s superview’s gesture recognizers if it has any, and so on up the view hierarchy. Thus, the place of a gesture recognizer in the view hierarchy matters, even though it isn’t part of the responder chain.

UITouch and UIEvent provide complementary ways of learning how touches and gesture recognizers are associated. UITouch’s gestureRecognizers lists the gesture recognizers that are currently handling this touch. UIEvent’s touchesForGestureRecognizer: lists the touches that are currently being handled by a particular gesture recognizer.
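As a small sketch of these two complementary queries, a view might log the associations as a touch arrives. The class name RecognizerLoggingView is hypothetical, and the sketch assumes that gesture recognizers have been attached to the view elsewhere.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view class, for illustration only.
@interface RecognizerLoggingView : UIView
@end

@implementation RecognizerLoggingView

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    // from the touch: which gesture recognizers are handling it?
    for (UITouch* t in touches) {
        NSLog(@"touch %p: %@", t, t.gestureRecognizers);
    }
    // from the event: which touches is each of this view's
    // gesture recognizers handling?
    for (UIGestureRecognizer* g in self.gestureRecognizers) {
        NSSet* set = [event touchesForGestureRecognizer:g];
        NSLog(@"%@: %lu touches", [g class],
              (unsigned long)[set count]);
    }
}

@end
```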

Each gesture recognizer maintains its own state as touch events arrive, building up evidence as to what kind of gesture this is. When one of them decides that it has recognized its own type of gesture, it emits either a single message (to indicate, for example, that a finger has tapped) or a series of messages (to indicate, for example, that a finger is moving); the distinction here is between a discrete and a continuous gesture. What message a gesture recognizer emits, and to what object it sends it, is set through a target–action dispatch table attached to the gesture recognizer; a gesture recognizer is rather like a UIControl (Chapter 11) in this regard. Indeed, one might say that a gesture recognizer simplifies the touch handling of any view to be like that of a control. The difference is that one control may report several different control events, whereas each gesture recognizer reports only one gesture type, with different gestures being reported by different gesture recognizers. This architecture implies that it is unnecessary to subclass UIView merely in order to implement touch analysis.

UIGestureRecognizer itself is abstract, providing methods and properties to its subclasses. Among these are:

initWithTarget:action:

The designated initializer. Each message emitted by a UIGestureRecognizer is simply a matter of sending the action message to the target. Further target–action pairs may be added with addTarget:action: and removed with removeTarget:action:.

Two forms of selector are possible: either there is no parameter, or there is a single parameter which will be the gesture recognizer. Most commonly, you’ll use the second form, so that the target can identify and query the gesture recognizer; moreover, using the second form also gives the target a reference to the view, because the gesture recognizer provides a reference to its view as the view property.

locationOfTouch:inView:
The touch is specified by an index number. The numberOfTouches property provides a count of current touches; the touches themselves are inaccessible from outside the gesture recognizer.
enabled
A convenient way to turn a gesture recognizer off without having to remove it from its view.
state, view
I’ll discuss state later on. The view is the view to which this gesture recognizer is attached.
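Here is a minimal sketch of the one-parameter form of action selector, showing how the target can work its way from the gesture recognizer to the view and to a touch location. The class name TapViewController and the method name tapped: are hypothetical.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view controller, for illustration only.
@interface TapViewController : UIViewController
@end

@implementation TapViewController

- (void) viewDidLoad {
    [super viewDidLoad];
    UITapGestureRecognizer* t = [[UITapGestureRecognizer alloc]
                                 initWithTarget:self
                                 action:@selector(tapped:)];
    [self.view addGestureRecognizer:t];
}

// one-parameter form: the parameter is the gesture recognizer
- (void) tapped: (UITapGestureRecognizer*) g {
    UIView* v = g.view; // the view the recognizer is attached to
    // guard the index; asking for a nonexistent touch is an error
    if (g.numberOfTouches > 0) {
        CGPoint p = [g locationOfTouch:0 inView:v];
        NSLog(@"tap in %@ at %@", v, NSStringFromCGPoint(p));
    }
}

@end
```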

Built-in UIGestureRecognizer subclasses are provided for six common gesture types: tap, pinch, pan (drag), swipe, rotate, and long press. These embody properties and methods likely to be needed for each type of gesture, either in order to configure the gesture recognizer beforehand or in order to query it as to the state of an ongoing gesture:

UITapGestureRecognizer (discrete)
Configuration: numberOfTapsRequired, numberOfTouchesRequired (“touches” means simultaneous fingers).
UIPinchGestureRecognizer (continuous)
State: scale, velocity.
UIRotationGestureRecognizer (continuous)
State: rotation, velocity.
UISwipeGestureRecognizer (discrete)
Configuration: direction (meaning permitted directions, a bitmask), numberOfTouchesRequired.
UIPanGestureRecognizer (continuous)

Configuration: minimumNumberOfTouches, maximumNumberOfTouches.

State: translationInView:, setTranslation:inView:, and velocityInView:; the coordinate system of the specified view is used, so to follow a finger you’ll use the superview of the view being dragged, just as we did in the examples earlier.

UILongPressGestureRecognizer (continuous)
Configuration: numberOfTapsRequired, numberOfTouchesRequired, minimumPressDuration, allowableMovement. The numberOfTapsRequired is the count of taps before the tap that stays down; so it can be 0 (the default). The allowableMovement setting lets you compensate for the fact that the user’s finger is unlikely to remain steady during an extended press; thus we need to provide some limit before deciding that this gesture is, say, a drag, and not a long press after all. On the other hand, once the long press is recognized, the finger is permitted to drag.
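To give a sense of how such configuration looks in practice, here is a sketch that sets up a swipe recognizer and a long press recognizer. The class name, the action method names, and the particular values chosen (two fingers, one second, 20 points) are arbitrary assumptions made for illustration.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view controller, for illustration only.
@interface ConfigViewController : UIViewController
@end

@implementation ConfigViewController

- (void) viewDidLoad {
    [super viewDidLoad];
    // a two-finger swipe, leftward or rightward
    UISwipeGestureRecognizer* s = [[UISwipeGestureRecognizer alloc]
                                   initWithTarget:self
                                   action:@selector(swiped:)];
    s.direction = UISwipeGestureRecognizerDirectionLeft |
                  UISwipeGestureRecognizerDirectionRight;
    s.numberOfTouchesRequired = 2;
    [self.view addGestureRecognizer:s];
    // a press held for a second, tolerating 20 points of drift
    UILongPressGestureRecognizer* lp =
        [[UILongPressGestureRecognizer alloc]
         initWithTarget:self
         action:@selector(pressed:)];
    lp.minimumPressDuration = 1.0;
    lp.allowableMovement = 20;
    [self.view addGestureRecognizer:lp];
}

- (void) swiped: (UISwipeGestureRecognizer*) s {
    NSLog(@"swipe");
}

- (void) pressed: (UILongPressGestureRecognizer*) lp {
    NSLog(@"long press, state %ld", (long)lp.state);
}

@end
```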

UIGestureRecognizer also provides a locationInView: method. This is a single point, even if there are multiple touches. The subclasses implement this variously. For example, for UIPanGestureRecognizer, the location is where the touch is if there’s a single touch, but it’s a sort of midpoint (“centroid”) if there are multiple touches.

We already know enough to implement, using a gesture recognizer, a view that responds to a single tap, or a view that responds to a double tap. We don’t yet know quite enough to implement a view that lets itself be dragged around, or a view that can respond to more than one gesture; we’ll come to that. Meanwhile, here’s code that implements a view that responds to a single tap:

UITapGestureRecognizer* t = [[UITapGestureRecognizer alloc]
                             initWithTarget:self
                             action:@selector(singleTap)];
[v addGestureRecognizer:t];
// ...
- (void) singleTap {
    NSLog(@"single");
}

And here’s code that implements a view that responds to a double tap:

UITapGestureRecognizer* t = [[UITapGestureRecognizer alloc]
                             initWithTarget:self
                             action:@selector(doubleTap)];
t.numberOfTapsRequired = 2;
[v addGestureRecognizer:t];
// ...
- (void) doubleTap {
    NSLog(@"double");
}

For a continuous gesture like dragging, we need to know both when the gesture is in progress and when the gesture ends. This brings us to the subject of a gesture recognizer’s state.

A gesture recognizer implements a notion of states (the state property); it passes through these states in a definite progression. The gesture recognizer remains in the Possible state until it can make a decision one way or the other as to whether this is in fact the correct gesture. The documentation neatly lays out the possible progressions:

Wrong gesture
Possible → Failed. No action message is sent.
Discrete gesture (like a tap), recognized
Possible → Ended. One action message is sent, when the state changes to Ended.
Continuous gesture (like a drag), recognized
Possible → Began → Changed (repeatedly) → Ended. Action messages are sent once for Began, as many times as necessary for Changed, and once for Ended.
Continuous gesture, recognized but later cancelled
Possible → Began → Changed (repeatedly) → Cancelled. Action messages are sent once for Began, as many times as necessary for Changed, and once for Cancelled.

The actual state names are UIGestureRecognizerStatePossible and so forth. The name UIGestureRecognizerStateRecognized is actually a synonym for the Ended state; I find this unnecessary and confusing and I’ll ignore it in my discussion.
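For a continuous gesture, then, an action handler will typically examine the state each time it is called. Here is a sketch using a long press gesture recognizer; the class name PressViewController and the method name longPress: are hypothetical, and the log messages stand in for whatever real response the app would make.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view controller, for illustration only.
@interface PressViewController : UIViewController
@end

@implementation PressViewController

- (void) viewDidLoad {
    [super viewDidLoad];
    UILongPressGestureRecognizer* lp =
        [[UILongPressGestureRecognizer alloc]
         initWithTarget:self
         action:@selector(longPress:)];
    [self.view addGestureRecognizer:lp];
}

- (void) longPress: (UILongPressGestureRecognizer*) lp {
    switch (lp.state) {
        case UIGestureRecognizerStateBegan:
            NSLog(@"recognized; the finger is still down");
            break;
        case UIGestureRecognizerStateChanged:
            NSLog(@"the finger moved after recognition");
            break;
        case UIGestureRecognizerStateEnded:
            NSLog(@"the finger was lifted");
            break;
        case UIGestureRecognizerStateCancelled:
            NSLog(@"interrupted; undo any provisional changes");
            break;
        default:
            // Possible and Failed send no action message
            break;
    }
}

@end
```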

We now know enough to implement, using a gesture recognizer, a view that lets itself be dragged around in any direction by a single finger. Our maintenance of state is greatly simplified, because a UIPanGestureRecognizer maintains a delta (translation) for us. This delta, available using translationInView:, is reckoned from the touch’s initial position. So we need to store our center only once:

UIPanGestureRecognizer* p =
    [[UIPanGestureRecognizer alloc] initWithTarget:self
                                            action:@selector(dragging:)];
[v addGestureRecognizer:p];
// ...
- (void) dragging: (UIPanGestureRecognizer*) p {
    UIView* vv = p.view;
    if (p.state == UIGestureRecognizerStateBegan)
        self->_origC = vv.center;
    CGPoint delta = [p translationInView: vv.superview];
    CGPoint c = self->_origC;
    c.x += delta.x; c.y += delta.y;
    vv.center = c;
}

Actually, it’s possible to write that code without maintaining any state at all, because we are allowed to reset the UIPanGestureRecognizer’s delta, using setTranslation:inView:. So:

- (void) dragging: (UIPanGestureRecognizer*) p {
    UIView* vv = p.view;
    if (p.state == UIGestureRecognizerStateBegan ||
            p.state == UIGestureRecognizerStateChanged) {
        CGPoint delta = [p translationInView: vv.superview];
        CGPoint c = vv.center;
        c.x += delta.x; c.y += delta.y;
        vv.center = c;
        [p setTranslation: CGPointZero inView: vv.superview];
    }
}

A gesture recognizer also works, as I’ve already mentioned, if it is attached to the superview (or further up the hierarchy) of the view in which the user gestures. For example, if a tap gesture recognizer is attached to the window’s root view, the user can tap on any other view, and the tap will be recognized; the other view’s mere presence does not “block” the root view’s gesture recognizer from recognizing the gesture, even if it is a UIControl that responds autonomously to touches.

This behavior comes as a surprise to beginners, but it makes sense, because if it were not the case, certain gestures would be impossible. Imagine, for example, a pair of views on each of which the user can tap individually, but which the user can also touch simultaneously (one finger on each view) and rotate together around their mutual centroid. Neither view can detect the rotation qua rotation, because neither view receives both touches; only the superview can detect it, so the fact that the views themselves respond to touches must not prevent the superview’s gesture recognizer from operating.

Suppose, then, that your window’s root view has a UITapGestureRecognizer attached to it (perhaps because you want to be able to recognize taps on the background), but there is also a UIButton within it. How is that gesture recognizer to ignore a tap on the button? A UIView instance method introduced in iOS 6 solves the problem: gestureRecognizerShouldBegin:. Its parameter is a gesture recognizer belonging to this view or to a view further up the view hierarchy. That gesture recognizer has recognized its gesture as taking place in this view; but by returning NO, the view can tell the gesture recognizer to bow out and do nothing, not sending any action messages, and permitting this view to respond to the touch as if the gesture recognizer weren’t there.

Thus, for example, a UIButton could return NO for a single tap UITapGestureRecognizer; a single tap on the button would then trigger the button’s action message, not the gesture recognizer’s action message. And in fact a UIButton, by default, does return NO for a single tap UITapGestureRecognizer whose view is not the UIButton itself. (If the gesture recognizer is for some gesture other than a tap, then the problem never arises, because a tap on the button won’t cause the gesture recognizer to recognize in the first place.) Other built-in controls may also implement gestureRecognizerShouldBegin: in such a way as to prevent accidental interaction with a gesture recognizer; the documentation says that a UISlider implements it in such a way that a UISwipeGestureRecognizer won’t prevent the user from sliding the “thumb,” and there may be other cases that aren’t documented explicitly. Naturally, you can take advantage of this feature in your own UIView subclasses.
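
Here, for instance, is a sketch of how a UIView subclass of your own might behave much as UIButton does. The class name MyTappableView is hypothetical, and the policy (refuse any tap gesture recognizer attached to some other view) is merely an assumption about what such a view might want:

```objc
// In the implementation of a hypothetical UIView subclass, MyTappableView
- (BOOL) gestureRecognizerShouldBegin: (UIGestureRecognizer*) g {
    // a tap recognizer belonging to a view further up the hierarchy
    // must not steal taps that land on this view
    if ([g isKindOfClass: [UITapGestureRecognizer class]] && g.view != self)
        return NO;
    return [super gestureRecognizerShouldBegin: g];
}
```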

Warning

Remember that this automatic behavior of built-in controls is new in iOS 6. If you write code intended to be backwards-compatible to iOS 5 or before, beware of unexpected interactions between gesture recognizers and controls.

Another way of resolving possible conflicts between a control and a gesture recognizer is through the gesture recognizer’s delegate, which I’ll discuss later in this chapter.

Multiple Gesture Recognizers

The question naturally arises of what happens when multiple gesture recognizers are in play. This isn’t a matter merely of multiple recognizers attached to a single view, because, as I have just said, if a view is touched, not only its own gesture recognizers but any gesture recognizers attached to views further up the view hierarchy are also in play, simultaneously. I like to think of a view as surrounded by a swarm of gesture recognizers — its own and those of its superview (and so on). In reality, it is a touch that has a swarm of gesture recognizers; that’s why a UITouch has a gestureRecognizers property, in the plural.

In general, once a gesture recognizer succeeds in recognizing its gesture, any other gesture recognizers associated with its touches are forced into the Failed state, and whatever touches were associated with those gesture recognizers are no longer sent to them; in effect, the first gesture recognizer in a swarm that recognizes its gesture owns the gesture, and those touches, from then on.

In many cases, this behavior alone will correctly eliminate conflicts. For example, we can add both our UITapGestureRecognizer for a single tap and our UIPanGestureRecognizer to a view and everything will just work.

What happens if we also add the UITapGestureRecognizer for a double tap? Dragging works, and single tap works; double tap works too, but without preventing the single tap from working. So, on a double tap, both the single tap action handler and the double tap action handler are called.

If that isn’t what we want, we don’t have to use delayed performance, as we did earlier. Instead, we can create a dependency between one gesture recognizer and another, telling the first to suspend judgement until the second has decided whether this is its gesture, by sending the first the requireGestureRecognizerToFail: message. This message doesn’t mean “force this other recognizer to fail”; it means, “you can’t succeed until this other recognizer fails.”

So our view is now configured as follows:

UITapGestureRecognizer* t2 = [[UITapGestureRecognizer alloc]
                              initWithTarget:self
                              action:@selector(doubleTap)];
t2.numberOfTapsRequired = 2;
[v addGestureRecognizer:t2];

UITapGestureRecognizer* t1 = [[UITapGestureRecognizer alloc]
                              initWithTarget:self
                              action:@selector(singleTap)];
[t1 requireGestureRecognizerToFail:t2];
[v addGestureRecognizer:t1];

UIPanGestureRecognizer* p = [[UIPanGestureRecognizer alloc]
                             initWithTarget:self
                             action:@selector(dragging:)];
[v addGestureRecognizer:p];

Note

Apple would prefer, if you’re going to have a view respond both to single tap and double tap, that you not make the former wait upon the latter (because this delays your response after the single tap). Rather, they would like you to arrange things so that it doesn’t matter that you respond to a single tap that is the first tap of a double tap. This isn’t always feasible, of course; Apple’s own Mobile Safari is a clear counterexample.

Subclassing Gesture Recognizers

To subclass a built-in gesture recognizer subclass, you must do the following things:

  • At the start of the implementation file, import <UIKit/UIGestureRecognizerSubclass.h>. This file contains a category on UIGestureRecognizer that allows you to set the gesture recognizer’s state (which is otherwise read-only), along with declarations for the methods you may need to override.
  • Override any touches... methods you need to (as if the gesture recognizer were a UIResponder); you will almost certainly call super so as to take advantage of the built-in behavior. In overriding a touches... method, you need to think like a gesture recognizer. As these methods are called, a gesture recognizer is setting its state; you must interact with that process.

To illustrate, we will subclass UIPanGestureRecognizer so as to implement a view that can be moved only horizontally or vertically. Our strategy will be to make two UIPanGestureRecognizer subclasses — one that allows only horizontal movement, and another that allows only vertical movement. They will make their recognition decisions in a mutually exclusive manner, so we can attach an instance of each to our view. This separates the decision-making logic in a gorgeously encapsulated object-oriented manner — a far cry from the spaghetti code we wrote earlier to do this same task.

I will show only the code for the horizontal drag gesture recognizer, because the vertical recognizer is symmetrically identical. We maintain just one instance variable, _origLoc, which we will use once to determine whether the user’s initial movement is horizontal. We override touchesBegan:withEvent: to set our instance variable with the first touch’s location:

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    self->_origLoc = [[touches anyObject] locationInView:self.view.superview];
    [super touchesBegan: touches withEvent: event];
}

We then override touchesMoved:withEvent:; all the recognition logic is here. This method will be called for the first time with the state still at Possible. At that moment, we look to see if the user’s movement is more horizontal than vertical. If it isn’t, we set the state to Failed. But if it is, we just step back and let the superclass do its thing:

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    if (self.state == UIGestureRecognizerStatePossible) {
        CGPoint loc = [[touches anyObject] locationInView:self.view.superview];
        CGFloat deltaX = fabs(loc.x - self->_origLoc.x);
        CGFloat deltaY = fabs(loc.y - self->_origLoc.y);
        if (deltaY >= deltaX)
            self.state = UIGestureRecognizerStateFailed;
    }
    [super touchesMoved: touches withEvent:event];
}

We now have a view that moves only if the user’s initial gesture is horizontal. But that isn’t the entirety of what we want; we want a view that, itself, moves horizontally only. To implement this, we’ll simply lie to our client about where the user’s finger is, by overriding translationInView::

- (CGPoint)translationInView:(UIView *)v {
    CGPoint proposedTranslation = [super translationInView:v];
    proposedTranslation.y = 0;
    return proposedTranslation;
}
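
For comparison, here is a sketch of what the vertical recognizer’s two corresponding overrides might look like, assuming that it maintains the same _origLoc instance variable and overrides touchesBegan:withEvent: identically. Only the comparison and the zeroed coordinate change:

```objc
// Vertical counterpart (a sketch; the same _origLoc ivar is assumed)
- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    if (self.state == UIGestureRecognizerStatePossible) {
        CGPoint loc = [[touches anyObject] locationInView:self.view.superview];
        CGFloat deltaX = fabs(loc.x - self->_origLoc.x);
        CGFloat deltaY = fabs(loc.y - self->_origLoc.y);
        if (deltaX >= deltaY) // not more vertical than horizontal: fail
            self.state = UIGestureRecognizerStateFailed;
    }
    [super touchesMoved: touches withEvent:event];
}

- (CGPoint)translationInView:(UIView *)v {
    CGPoint proposedTranslation = [super translationInView:v];
    proposedTranslation.x = 0; // suppress horizontal movement
    return proposedTranslation;
}
```

Note that, with the comparisons written this way, an initial movement whose horizontal and vertical components are exactly equal causes both recognizers to fail.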

That example was simple, because we subclassed a fully functional built-in UIGestureRecognizer subclass. If you were to write your own UIGestureRecognizer subclass entirely from scratch, there would be more work to do:

  • You should definitely implement all four touches... handlers. Their job, at a minimum, is to advance the gesture recognizer through the canonical progression of its states. When the first touch arrives at a gesture recognizer, its state will be Possible; you never explicitly set the recognizer’s state to Possible yourself. As soon as you know this can’t be your gesture, you set the state to Failed (Apple says that a gesture recognizer should “fail early, fail often”). If the gesture gets past all the failure tests, you set the state instead either to Ended (for a discrete gesture) or to Began (for a continuous gesture); if Began, then you might set it to Changed, and ultimately you must set it to Ended. Action messages will be sent automatically at the appropriate moments.
  • You should probably implement reset. This is called after you reach the end of the progression of states to notify you that the gesture recognizer’s state is about to be set back to Possible; it is your chance to return your state machine to its starting configuration (resetting instance variables, for example).

Keep in mind that your gesture recognizer might stop receiving touches without notice. Just because it gets a touchesBegan:withEvent: call for a particular touch doesn’t mean it will ever get touchesEnded:withEvent: for that touch. If your gesture recognizer fails to recognize its gesture, either because it declares failure or because it is still in the Possible state when another gesture recognizer recognizes, it won’t get any more touches... calls for any of the touches that were being sent to it. This is why reset is so important; it’s the one reliable signal that it’s time to clean up and get ready to receive the beginning of another possible gesture.
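
To make this concrete, here is a minimal sketch of a discrete gesture recognizer written from scratch. It is not Apple’s code; the class name MyStationaryTouchRecognizer and the 10-point tolerance are arbitrary assumptions. It recognizes a single finger that comes down and goes up again without straying far from where it started:

```objc
#import <UIKit/UIKit.h>
#import <UIKit/UIGestureRecognizerSubclass.h>

// Hypothetical class, for illustration only
@interface MyStationaryTouchRecognizer : UIGestureRecognizer
@end

@implementation MyStationaryTouchRecognizer {
    CGPoint _startLoc;
}

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    [super touchesBegan:touches withEvent:event];
    // fail early: this is a one-finger gesture
    if ([[event touchesForGestureRecognizer:self] count] > 1) {
        self.state = UIGestureRecognizerStateFailed;
        return;
    }
    self->_startLoc = [[touches anyObject] locationInView:self.view];
}

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    [super touchesMoved:touches withEvent:event];
    CGPoint loc = [[touches anyObject] locationInView:self.view];
    CGFloat dx = loc.x - self->_startLoc.x;
    CGFloat dy = loc.y - self->_startLoc.y;
    if (dx*dx + dy*dy > 10*10) // 10 points is an arbitrary tolerance
        self.state = UIGestureRecognizerStateFailed;
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    [super touchesEnded:touches withEvent:event];
    // discrete gesture: Possible → Ended; the action message is sent for us
    if (self.state == UIGestureRecognizerStatePossible)
        self.state = UIGestureRecognizerStateEnded;
}

- (void) touchesCancelled:(NSSet *)touches withEvent:(UIEvent *)event {
    [super touchesCancelled:touches withEvent:event];
    self.state = UIGestureRecognizerStateFailed;
}

- (void) reset {
    [super reset];
    self->_startLoc = CGPointZero; // back to the starting configuration
}
@end
```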

Gesture Recognizer Delegate

A gesture recognizer can have a delegate, which can perform two types of task:

Block a gesture recognizer’s operation

gestureRecognizerShouldBegin: is sent to the delegate before the gesture recognizer passes out of the Possible state; return NO to force the gesture recognizer to proceed to the Failed state. (This happens after gestureRecognizerShouldBegin: has been sent to the view in which the touch took place. That view must not have returned NO, or we wouldn’t have reached this stage.)

gestureRecognizer:shouldReceiveTouch: is sent to the delegate before a touch is sent to the gesture recognizer’s touchesBegan:... method; return NO to prevent that touch from ever being sent to the gesture recognizer.

Mediate simultaneous gesture recognition
When a gesture recognizer is about to declare that it recognizes its gesture, and this declaration would force the failure of another gesture recognizer, gestureRecognizer:shouldRecognizeSimultaneouslyWithGestureRecognizer: is sent both to the delegate of the gesture recognizer that is about to recognize and to the delegate of the gesture recognizer whose failure would be forced. Return YES to prevent that failure, thus allowing both gesture recognizers to operate simultaneously. For example, a view could respond to both a two-fingered pinch and a two-fingered pan, the one applying a scale transform, the other changing the view’s center.
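
The blocking method gestureRecognizer:shouldReceiveTouch: lends itself to a short sketch: a delegate that keeps a gesture recognizer from ever seeing a touch that lands on a control. This assumes that self adopts the UIGestureRecognizerDelegate protocol and has been made the gesture recognizer’s delegate; the policy of excluding every UIControl is an assumption for the sake of illustration, not a rule of the framework:

```objc
- (BOOL) gestureRecognizer: (UIGestureRecognizer*) g
        shouldReceiveTouch: (UITouch*) touch {
    // touch.view is the touch's hit-test view
    if ([touch.view isKindOfClass: [UIControl class]])
        return NO; // leave touches on buttons, sliders, and so on to the control
    return YES;
}
```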

As an example, we will use delegate messages to combine a UILongPressGestureRecognizer and a UIPanGestureRecognizer, as follows: the user must perform a tap-and-a-half (tap and hold) to “get the view’s attention,” which we will indicate by a pulsing animation on the view; then (and only then) the user can drag the view.

In keeping with encapsulation, the UILongPressGestureRecognizer’s handler will take care of starting and stopping the animation, and the UIPanGestureRecognizer’s handler will take care of the drag in the familiar manner:

- (void) longPress: (UILongPressGestureRecognizer*) lp {
    if (lp.state == UIGestureRecognizerStateBegan) {
        CABasicAnimation* anim =
            [CABasicAnimation animationWithKeyPath: @"transform"];
        anim.toValue =
            [NSValue valueWithCATransform3D:
                CATransform3DMakeScale(1.1, 1.1, 1)];
        anim.fromValue =
            [NSValue valueWithCATransform3D:CATransform3DIdentity];
        anim.repeatCount = HUGE_VALF;
        anim.autoreverses = YES;
        [lp.view.layer addAnimation:anim forKey:nil];
    }
    if (lp.state == UIGestureRecognizerStateEnded ||
        lp.state == UIGestureRecognizerStateCancelled) {
        [lp.view.layer removeAllAnimations];
    }
}

- (void) panning: (UIPanGestureRecognizer*) p {
    UIView* vv = p.view;
    if (p.state == UIGestureRecognizerStateBegan)
        self->_origC = vv.center;
    CGPoint delta = [p translationInView: vv.superview];
    CGPoint c = self->_origC;
    c.x += delta.x; c.y += delta.y;
    vv.center = c;
}

As we created our gesture recognizers, we kept a reference to the UILongPressGestureRecognizer (longPresser), and we made ourself the UIPanGestureRecognizer’s delegate. So we will receive delegate messages. If the UIPanGestureRecognizer tries to declare success while the UILongPressGestureRecognizer’s state is Failed or still at Possible, we prevent it. If the UILongPressGestureRecognizer succeeds, we permit the UIPanGestureRecognizer to operate as well:

- (BOOL) gestureRecognizerShouldBegin: (UIGestureRecognizer*) g {
    if (self.longPresser.state == UIGestureRecognizerStatePossible ||
        self.longPresser.state == UIGestureRecognizerStateFailed)
        return NO;
    return YES;
}
- (BOOL)gestureRecognizer: (UIGestureRecognizer*) g1
        shouldRecognizeSimultaneouslyWithGestureRecognizer:
            (UIGestureRecognizer*) g2 {
    return YES;
}

The result is that the view can be dragged only if it is pulsing; in effect, what we’ve done is to compensate, using delegate methods, for the fact that UIGestureRecognizer has no requireGestureRecognizerToSucceed: method.
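
The setup that this example presupposes might be sketched as follows. The property name longPresser comes from the preceding discussion; the sketch assumes that it is declared as a strong property, that self adopts the UIGestureRecognizerDelegate protocol, and that v is the view to be dragged:

```objc
// assumed declaration, for example in a class extension:
// @property (nonatomic, strong) UILongPressGestureRecognizer* longPresser;

UILongPressGestureRecognizer* lp = [[UILongPressGestureRecognizer alloc]
                                    initWithTarget:self
                                    action:@selector(longPress:)];
UIPanGestureRecognizer* p = [[UIPanGestureRecognizer alloc]
                             initWithTarget:self
                             action:@selector(panning:)];
self.longPresser = lp; // the delegate methods consult this reference
p.delegate = self;     // so we receive the pan recognizer's delegate messages
[v addGestureRecognizer:lp];
[v addGestureRecognizer:p];
```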

You might object that that example is a bit artificial, because a UILongPressGestureRecognizer can implement draggability all on its own. Its Changed state indicates a drag; it lacks the convenient translationInView: method, but we know how to work around that. So here, for completeness, is the same behavior implemented using a single gesture recognizer and a single handler; although this is doable, I find the previous implementation more elegant and readable:

- (void) longPress: (UILongPressGestureRecognizer*) lp {
    UIView* vv = lp.view;
    if (lp.state == UIGestureRecognizerStateBegan) {
        CABasicAnimation* anim =
            [CABasicAnimation animationWithKeyPath: @"transform"];
        anim.toValue =
            [NSValue valueWithCATransform3D:
                CATransform3DMakeScale(1.1, 1.1, 1)];
        anim.fromValue =
            [NSValue valueWithCATransform3D:CATransform3DIdentity];
        anim.repeatCount = HUGE_VALF;
        anim.autoreverses = YES;
        [vv.layer addAnimation:anim forKey:nil];
        self->_origOffset =
            CGPointMake(CGRectGetMidX(vv.bounds) - [lp locationInView:vv].x,
            CGRectGetMidY(vv.bounds) - [lp locationInView:vv].y);
    }
    if (lp.state == UIGestureRecognizerStateChanged) {
        CGPoint c = [lp locationInView: vv.superview];
        c.x += self->_origOffset.x;
        c.y += self->_origOffset.y;
        vv.center = c;
    }
    if (lp.state == UIGestureRecognizerStateEnded ||
        lp.state == UIGestureRecognizerStateCancelled) {
        [vv.layer removeAllAnimations];
    }
}

If you are subclassing a gesture recognizer class, you can incorporate delegate-like behavior into the subclass. By overriding canPreventGestureRecognizer: and canBePreventedByGestureRecognizer:, you can mediate simultaneous gesture recognition at the class level. The built-in gesture recognizer subclasses already do this; that is why, for example, a single tap UITapGestureRecognizer does not, by recognizing its gesture, cause the failure of a double tap UITapGestureRecognizer.

You can also, in a gesture recognizer subclass, send ignoreTouch:forEvent: directly to a gesture recognizer (typically, to self). This has the same effect as the delegate method gestureRecognizer:shouldReceiveTouch: returning NO, blocking delivery of that touch to the gesture recognizer for as long as that touch exists. For example, if you’re in the middle of an already recognized gesture and a new touch arrives, you might well elect to ignore it.
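
A sketch of that idea, in a hypothetical gesture recognizer subclass that has imported <UIKit/UIGestureRecognizerSubclass.h>, might look like this; which states count as “already recognized” is an assumption about what such a subclass would want:

```objc
- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    if (self.state == UIGestureRecognizerStateBegan ||
        self.state == UIGestureRecognizerStateChanged) {
        // the gesture is already under way; refuse latecomers
        for (UITouch* t in touches)
            [self ignoreTouch:t forEvent:event];
        return;
    }
    [super touchesBegan:touches withEvent:event];
}
```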

Gesture Recognizers in the Nib

Instead of instantiating a gesture recognizer in code, you can create and configure it in a nib or storyboard. (I’m a bit hazy on what version of Xcode introduced this feature; I first noticed it in Xcode 4.5.) Drag a gesture recognizer from the Object library into the canvas. It becomes a top-level nib object. You can configure the gesture recognizer’s properties in the Attributes inspector. Control-drag from a view object (meaning an object whose class is UIView or any UIView subclass) to a gesture recognizer to make that gesture recognizer belong to that view; the view’s gestureRecognizers property is an array, so its gestureRecognizers outlet is an outlet collection (see Chapter 7) and you can add more than one gesture recognizer to a view in the nib.

A gesture recognizer’s target–action pair can be configured in the nib as well. This works just like configuring a target–action pair for a control (Chapter 7). As a hint to Xcode, the action method’s signature should return IBAction, and it should take a single parameter, which will be a reference to the gesture recognizer. You can then drag from the gesture recognizer, or from its Sent Actions “selector” listing in the Connections inspector, to that method in code in an assistant pane — or, if this method is in a known object’s class, such as the File’s Owner, you can drag directly to that object within the nib. (However, although a gesture recognizer has a full-fledged target–action dispatch table, only one target–action pair can be configured in the nib. This seems like a bug; after all, control configuration is not restricted in this way.)

A gesture recognizer in the nib also has a delegate outlet, which can be hooked to any object.

A view retains its gesture recognizers, so there will usually be no need for memory management on a gesture recognizer in the nib. It’s a full-fledged nib object, so you can make an outlet to it; you would do this, for instance, if you needed to send a requireGestureRecognizerToFail: message to a gesture recognizer early in its lifetime, as we did previously in order to mediate between a single tap recognizer and a double tap recognizer.
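
For instance, assuming a view controller with two outlets hooked to tap gesture recognizers in the nib (the outlet names singleTapper and doubleTapper are hypothetical), the dependency could be established in viewDidLoad:

```objc
// hypothetical outlets, connected in the nib:
// @property (nonatomic, weak) IBOutlet UITapGestureRecognizer* singleTapper;
// @property (nonatomic, weak) IBOutlet UITapGestureRecognizer* doubleTapper;

- (void) viewDidLoad {
    [super viewDidLoad];
    // the single tap must wait until the double tap has failed
    [self.singleTapper requireGestureRecognizerToFail: self.doubleTapper];
}
```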

Touch Delivery

Let’s now return to the very beginning of the touch reporting process, when the system sends the app a UIEvent containing touches, and tease apart in full detail the entire procedure by which a touch is delivered to views and gesture recognizers:

  1. Whenever a new touch appears, the application calls the UIView instance method hitTest:withEvent: on the window, which returns the view (called, appropriately, the hit-test view) that will be permanently associated with this touch. This method uses the UIView instance method pointInside:withEvent: along with hitTest:withEvent: recursively down the view hierarchy to find the frontmost view containing the touch’s location and capable of receiving a touch. The logic of how a view’s userInteractionEnabled, hidden, and alpha affect its touchability is implemented at this stage.
  2. Each time the touch situation changes, the application calls its own sendEvent:, which in turn calls the window’s sendEvent:. The window delivers each of an event’s touches by calling the appropriate touches... method(s), as follows:

    1. As a touch first appears, it is initially delivered to the hit-test view’s swarm of gesture recognizers. It is then also delivered to that view. The logic of withholding touches in obedience to multipleTouchEnabled and exclusiveTouch is also implemented at this stage. For example, additional touches won’t be delivered to a view if that view currently has a touch and has multipleTouchEnabled set to NO.
    2. If a gesture is recognized by a gesture recognizer, then for any touch associated with this gesture recognizer:

      1. touchesCancelled:withEvent: is sent to the touch’s view, and the touch is no longer delivered to its view.
      2. If that touch was associated with any other gesture recognizer, that gesture recognizer is forced to fail.
    3. If a gesture recognizer fails, either because it declares failure or because it is forced to fail, its touches are no longer delivered to it, but (except as already specified) they continue to be delivered to their view.
    4. If a touch would be delivered to a view, but that view does not respond to the appropriate touches... method, a responder further up the responder chain (Chapter 11) is sought that does respond to it, and the touch is delivered there.
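
If you want to watch this procedure in action, one possibility is a UIWindow subclass that logs every touch on its way through sendEvent:. This is a diagnostic sketch only; the class name MyWindow is hypothetical, and you would need to arrange for your app’s window to be an instance of it:

```objc
@interface MyWindow : UIWindow
@end

@implementation MyWindow
- (void) sendEvent: (UIEvent*) event {
    for (UITouch* t in [event allTouches]) {
        // phase, hit-test view, and the size of the touch's swarm of recognizers
        NSLog(@"phase %d, view %@, %lu recognizers",
              (int)t.phase, t.view,
              (unsigned long)[t.gestureRecognizers count]);
    }
    [super sendEvent: event]; // essential; otherwise no touches are delivered
}
@end
```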

The rest of this chapter elaborates on each stage of this standard procedure, nearly every bit of which can be customized to some extent.

Hit-Testing

Hit-testing is the determination of what view the user touched. View hit-testing uses the UIView instance method hitTest:withEvent:, which returns either a view (the hit-test view) or nil. The idea is to find the frontmost view containing the touch point. This method uses an elegant recursive algorithm, as follows:

  1. A view’s hitTest:withEvent: first calls the same method on its own subviews, if it has any, because a subview is considered to be in front of its superview. The subviews are queried in reverse order, because that’s front-to-back order (Chapter 14): thus, if two sibling views overlap, the one in front reports the hit first.
  2. If, as a view hit-tests its subviews, any of those subviews responds by returning a view, it stops querying its subviews and immediately returns the view that was returned to it. Thus, the very first view to declare itself the hit-test view immediately percolates all the way to the top of the call chain and is the hit-test view.
  3. If, on the other hand, a view has no subviews, or if all of its subviews return nil (indicating that neither they nor their subviews were hit), then the view calls pointInside:withEvent: on itself. If this call reveals that the touch was inside this view, the view returns itself, declaring itself the hit-test view; otherwise it returns nil.

    No problem arises if a view has a transform, because pointInside:withEvent: takes the transform into account. That’s why a rotated button continues to work correctly.

It is also up to hitTest:withEvent: to implement the logic of touch restrictions exclusive to a view. If a view’s userInteractionEnabled is NO, or its hidden is YES, or its alpha is close to 0.0, it returns nil without hit-testing any of its subviews and without calling pointInside:withEvent:. Thus these restrictions do not, of themselves, exclude a view from being hit-tested; on the contrary, they operate precisely by modifying a view’s hit-test result.

However, hit-testing knows nothing about multipleTouchEnabled (which involves multiple touches) or exclusiveTouch (which involves multiple views). The logic of obedience to these properties is implemented at a later stage of the story.

You can use hit-testing yourself at any moment where it might prove useful. In calling hitTest:withEvent:, supply a point in the coordinates of the view to which the message is sent. The second parameter can be nil if you have no event.

For example, suppose we have a UIView with two UIImageView subviews. We want to detect a tap in either UIImageView, but we want to handle this at the level of the UIView. We can attach a UITapGestureRecognizer to the UIView, but how will we know which subview, if any, the tap was in?

Our first step must be to set userInteractionEnabled to YES for both UIImageViews. (This step is crucial; UIImageView is one of the few built-in view classes where this is NO by default, and a view whose userInteractionEnabled is NO won’t normally be the result of a call to hitTest:withEvent:.) Now, when our gesture recognizer’s action handler is called, the view can use hit-testing to determine where the tap was:

CGPoint p = [g locationOfTouch:0 inView:self]; // g is the gesture recognizer
UIView* v = [self hitTest:p withEvent:nil];

You can also override hitTest:withEvent: in a view subclass, to alter its results during touch delivery, thus customizing the touch delivery mechanism. I call this hit-test munging. Hit-test munging can be used selectively as a way of turning user interaction on or off in an area of the interface. In this way, some unusual effects can be produced.

For example, an important use of hit-test munging is to permit the touching of parts of subviews outside the bounds of their superview. If a view’s clipsToBounds is NO, a paradox arises: the user can see the regions of its subviews that are outside its bounds, but the user can’t touch them. This can be confusing and seems wrong. The solution is for the view to override hitTest:withEvent: as follows:

-(UIView *)hitTest:(CGPoint)point withEvent:(UIEvent *)event {
    UIView* result = [super hitTest:point withEvent:event];
    if (result)
        return result;
    for (UIView* sub in [self.subviews reverseObjectEnumerator]) {
        CGPoint pt = [self convertPoint:point toView:sub];
        result = [sub hitTest:pt withEvent:event];
        if (result)
            return result;
    }
    return nil;
}

Here are some further possible uses of hit-test munging, just to stimulate your imagination:

  • If a superview contains a UIButton but doesn’t return that UIButton from hitTest:withEvent:, that button can’t be tapped.
  • You might override hitTest:withEvent: to return the result from super most of the time, but to return self under certain conditions, effectively making all subviews untouchable without making the superview itself untouchable (as setting its userInteractionEnabled to NO would do).
  • A view whose userInteractionEnabled is NO can break the normal rules and return itself from hit-testing and can thus end up as the hit-test view.
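
The second of those ideas might be sketched like this. The BOOL property subviewsFrozen is hypothetical, standing in for whatever “certain conditions” your view cares about:

```objc
// in a hypothetical UIView subclass with a BOOL property, subviewsFrozen
- (UIView*) hitTest:(CGPoint)point withEvent:(UIEvent *)event {
    UIView* result = [super hitTest:point withEvent:event];
    if (result && self.subviewsFrozen)
        return self; // a hit on any subview is credited to this view instead
    return result;
}
```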

Hit-testing for layers

There is also hit-testing for layers. It doesn’t happen automatically, as part of sendEvent: or anything else; it’s up to you. It’s just a convenient way of finding out which layer would receive a touch at a point, if layers received touches. To hit-test layers, call hitTest: on a layer, with a point in superlayer coordinates.

Keep in mind, though, that layers do not receive touches. A touch is reported to a view, not a layer. A layer, except insofar as it is a view’s underlying layer and gets touch reporting because of its view, is completely untouchable; from the point of view of touches and touch reporting, it’s as if the layer weren’t on the screen at all. No matter where a layer may appear to be, a touch falls right through the layer to whatever view is behind it.

In the case of the layer that is a view’s underlying layer, you don’t need hit-testing. It is the view’s drawing; where it appears is where the view is. So a touch in that layer is equivalent to a touch in its view. Indeed, one might say that this is what views are actually for: to provide layers with touchability.

The only layers on which you’d need special hit-testing, then, would presumably be layers that are not themselves any view’s underlying layer, because those are the only ones you don’t find out about by normal view hit-testing. However, all layers, including a layer that is its view’s underlying layer, are part of the layer hierarchy, and can participate in layer hit-testing. So the most comprehensive way to hit-test layers is to start with the topmost layer, the window’s layer. In this example, we subclass UIWindow and override its hitTest:withEvent: so as to get layer hit-testing every time there is view hit-testing:

- (UIView*) hitTest:(CGPoint)point withEvent:(UIEvent *)event {
    CALayer* lay = [self.layer hitTest:point];
    // ... possibly do something with that information ...
    return [super hitTest:point withEvent:event];
}

Because this is the window, the view hit-test point works as the layer hit-test point; window bounds are screen bounds (Chapter 14). But usually you’ll have to convert to superlayer coordinates. In this example, we return to the CompassView developed in Chapter 16, in which all the parts of the compass are layers; we want to know whether the user tapped on the arrow layer. For simplicity, we’ve given the CompassView a UITapGestureRecognizer, and this is its action handler, in the CompassView itself. We convert to our superview’s coordinates, because these are also our layer’s superlayer coordinates:

// self is the CompassView; t is the UITapGestureRecognizer passed to the action handler
CGPoint p = [t locationOfTouch: 0 inView: self.superview];
CALayer* hitLayer = [self.layer hitTest:p];
if (hitLayer == ((CompassLayer*)self.layer).arrow) // ...

Layer hit-testing works by calling containsPoint:. However, containsPoint: takes a point in the layer’s coordinates, so to hand it a point that arrives through hitTest: you must first convert from superlayer coordinates:

BOOL hit =
    [lay containsPoint: [lay convertPoint:point fromLayer:lay.superlayer]];

Layer hit-testing knows nothing of the restrictions on touch delivery; it just reports on every sublayer, even those whose view has userInteractionEnabled set to NO.

Warning

The documentation warns that hitTest: must not be called on a CATransformLayer.

Hit-testing for drawings

The preceding example (letting the user tap on the compass arrow) worked, but we might complain that it is reporting a hit on the arrow even if the hit misses the drawing of the arrow. That’s true for view hit-testing as well. A hit is reported if we are within the view or layer as a whole; hit-testing knows nothing of drawing, transparent areas, and so forth.

If you know how the region is drawn and can reproduce the edge of that drawing as a CGPath, you can test whether a point is inside it with CGPathContainsPoint. So, for a layer, you could override hitTest: along these lines:

- (CALayer*) hitTest:(CGPoint)p {
    // p arrives in superlayer coordinates; convert it to our own
    CGPoint pt = [self convertPoint:p fromLayer:self.superlayer];
    CGMutablePathRef path = CGPathCreateMutable();
    // ... draw path here ...
    // NULL: no transform is applied to the path before testing
    CALayer* result = CGPathContainsPoint(path, NULL, pt, true) ? self : nil;
    CGPathRelease(path);
    return result;
}

Alternatively, it might be the case that if a pixel of the drawing is transparent, it’s outside the drawn region, so that it suffices to detect whether the pixel tapped is transparent. Unfortunately, there’s no way to ask a drawing (or a view, or a layer) for the color of a pixel; you have to make a bitmap and copy the drawing into it, and then ask the bitmap for the color of a pixel. If you can reproduce the content as an image, and all you care about is transparency, you can make a one-pixel alpha-only bitmap, draw the image in such a way that the pixel you want to test is the pixel drawn into the bitmap, and examine the transparency of the resulting pixel:

// assume im is a UIImage, point is the CGPoint to test
unsigned char pixel[1] = {0};
// 1x1 bitmap, 8 bits per component, 1 byte per row, no color space (alpha only)
CGContextRef context = CGBitmapContextCreate(pixel,
                                             1, 1, 8, 1, NULL,
                                             kCGImageAlphaOnly);
UIGraphicsPushContext(context);
// offset the image so that the point of interest lands on our single pixel
[im drawAtPoint:CGPointMake(-point.x, -point.y)];
UIGraphicsPopContext();
CGContextRelease(context);
CGFloat alpha = pixel[0]/255.0;
BOOL transparent = alpha < 0.01;

However, there can be complications; for example, there may not be a one-to-one relationship between the pixels of the underlying drawing and the points of the drawing as portrayed on the screen (because the drawing is stretched, for example). It’s a tricky problem, but in many cases, the CALayer method renderInContext: can be helpful here. This method allows you to copy a layer’s actual drawing into a context of your choice. If that context is, say, an image context created with UIGraphicsBeginImageContextWithOptions, you can now use the resulting image as im in the code above.
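As a rough sketch of that last step, here is one way the image might be produced. The helper name ImageFromLayer and the parameter name lay are mine, purely for illustration, and I'm assuming the layer's bounds origin is zero:

```objc
#import <UIKit/UIKit.h>
#import <QuartzCore/QuartzCore.h>

// Hypothetical helper: copy a layer's drawing into a UIImage.
static UIImage* ImageFromLayer(CALayer* lay) {
    // NO = not opaque, so transparent areas of the drawing stay transparent;
    // 0 = use the scale of the device's main screen
    UIGraphicsBeginImageContextWithOptions(lay.bounds.size, NO, 0);
    // renders the layer and its sublayers in the layer's own coordinate space
    [lay renderInContext:UIGraphicsGetCurrentContext()];
    UIImage* im = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return im;
}
```

The point you then test against this image should be expressed in the layer's own coordinates, since that is the coordinate space in which the layer was rendered.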

Hit-testing during animation

If user interaction is allowed during an animation that moves a view from one place to another, then if the user taps on the animated view, the tap might mysteriously fail because the view in the model layer is elsewhere; conversely, the user might accidentally tap where the view actually is in the model layer, and the tap will hit the animated view even though it appears to be elsewhere. If the position of a view or layer is being animated and you want the user to be able to tap on it, therefore, you’ll need to hit-test the presentation layer (see Chapter 17).

In this simple example, we have a superview containing a subview. To allow the user to tap on the subview even when it is being animated, we implement hit-test munging in the superview:

- (UIView*) hitTest:(CGPoint)point withEvent:(UIEvent *)event {
    // v is the animated subview
    CALayer* lay = [v.layer presentationLayer];
    CALayer* hitLayer = [lay hitTest: point];
    if (hitLayer == lay)
        return v;
    UIView* hitView = [super hitTest:point withEvent:event];
    if (hitView == v)
        return self;
    return hitView;
}

If the user taps outside the presentation layer, we cannot simply call super, because the user might tap at the spot to which the subview has in reality already moved (behind the “animation movie”), in which case super will report that it hit the subview. So if super does report this, we return self (assuming that we are what’s behind the animated subview at its new location).

However, as Apple puts it in the WWDC 2011 videos, the animated view “swallows the touch.” For example, suppose the view in motion is a button. Although our hit-test munging makes it possible for the user to tap the button as it is being animated, and although the user sees the button highlight in response, the button’s action message is not sent in response to this highlighting if the animation is in-flight when the tap takes place. This behavior seems unfortunate, but it’s generally possible to work around it (for instance, with a gesture recognizer).

Initial Touch Event Delivery

When the touch situation changes, an event containing all touches is handed to the UIApplication instance by calling its sendEvent:, and the UIApplication in turn hands it to the relevant UIWindow by calling its sendEvent:. The UIWindow then performs the complicated logic of examining, for every touch, the hit-test view and its superviews and their gesture recognizers and deciding which of them should be sent a touches... message, and does so.

These are delicate and crucial maneuvers, and you wouldn’t want to lame your application by interfering with them. Nevertheless, you can override sendEvent: in a subclass, and there are situations where you might wish to do so. This is just about the only case in which you might subclass UIApplication; if you do, remember to change the third argument in the call to UIApplicationMain in your main.m file to the string name of your UIApplication subclass so that your subclass is used to generate the app’s singleton UIApplication instance. If you subclass UIWindow, remember to change the window’s class in the app delegate code that instantiates the window.
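A minimal sketch of the UIApplication case might look like this. The subclass name MyApplication is hypothetical, and AppDelegate is assumed to be the app delegate class supplied by the project template:

```objc
// main.m
#import <UIKit/UIKit.h>
#import "AppDelegate.h"
#import "MyApplication.h" // hypothetical UIApplication subclass

int main(int argc, char *argv[]) {
    @autoreleasepool {
        // third argument: the string name of the UIApplication subclass
        return UIApplicationMain(argc, argv,
            NSStringFromClass([MyApplication class]),
            NSStringFromClass([AppDelegate class]));
    }
}

// MyApplication.m
@implementation MyApplication
- (void) sendEvent:(UIEvent *)event {
    // ... examine the event here, without disturbing it ...
    [super sendEvent:event]; // let normal delivery proceed
}
@end
```

Calling super is what keeps the normal delivery machinery working; omit it only for an event that you have deliberately decided to swallow.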

Now that gesture recognizers exist, it is unlikely that you will need to resort to such measures. A typical case, in the past, was that you needed to detect touches directed to an object of some built-in interface class in a way that subclassing it wouldn’t permit. For example, you want to know when the user swipes a UIWebView; you’re not allowed to subclass UIWebView, and in any case it eats the touch. The solution used to be to subclass UIWindow and override sendEvent:; you would then work out whether this was a swipe on the UIWebView and respond accordingly, or else call super. Now, however, you can attach a UISwipeGestureRecognizer to the UIWebView.
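Something along these lines would do it. This is only a sketch: the controller name, the webView property, and the action name webViewSwiped: are my assumptions, not part of any earlier example.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view controller that owns a UIWebView
@interface WebViewController : UIViewController
@property (nonatomic, strong) IBOutlet UIWebView* webView;
@end

@implementation WebViewController
- (void) viewDidLoad {
    [super viewDidLoad];
    UISwipeGestureRecognizer* swipe =
        [[UISwipeGestureRecognizer alloc] initWithTarget:self
                                                  action:@selector(webViewSwiped:)];
    swipe.direction = UISwipeGestureRecognizerDirectionLeft;
    [self.webView addGestureRecognizer:swipe];
}

- (void) webViewSwiped: (UISwipeGestureRecognizer*) g {
    // a discrete gesture recognizer calls its action once, on recognition
    NSLog(@"the user swiped the web view");
}
@end
```

Depending on how the web view's own internal gesture recognizers behave, you may also find that you need to permit simultaneous recognition through the gesture recognizer's delegate, as discussed later in this chapter.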

Gesture Recognizer and View

When a touch first appears and is delivered to a gesture recognizer, it is also delivered to its hit-test view, the same touches... method being called on both. This comes as a surprise to beginners, but it is the most reasonable approach, as it means that touch interpretation by a view isn’t jettisoned just because gesture recognizers are in the picture. Later on in the multitouch sequence, if all the gesture recognizers in a view’s swarm declare failure to recognize their gesture, that view’s internal touch interpretation just proceeds as if gesture recognizers had never been invented.

However, if a gesture recognizer in a view’s swarm recognizes its gesture, that view is sent touchesCancelled:withEvent: for any touches that went to that gesture recognizer and were hit-tested to that view, and subsequently the view no longer receives those touches.

This behavior can be changed by setting a gesture recognizer’s cancelsTouchesInView property to NO. If this is the case for every gesture recognizer in a view’s swarm, the view will receive touch events more or less as if no gesture recognizers were in the picture. Making this change, however, alters delivery logic rather drastically; it seems unlikely that you’d want to do that.

If a gesture recognizer happens to be ignoring a touch (because it was told to do so by ignoreTouch:forEvent:), then touchesCancelled:withEvent: won’t be sent to the view for that touch when the gesture recognizer recognizes its gesture. Thus, a gesture recognizer’s ignoring a touch is the same as simply letting it fall through to the view, as if the gesture recognizer weren’t there.

Gesture recognizers can also delay the delivery of touches to a view, and by default they do. The UIGestureRecognizer property delaysTouchesEnded is YES by default, meaning that when a touch reaches UITouchPhaseEnded and the gesture recognizer’s touchesEnded:withEvent: is called, if the gesture recognizer is still allowing touches to be delivered to the view because its state is still Possible, it doesn’t deliver this touch until it has resolved the gesture. When it does, either it will recognize the gesture, in which case the view will have touchesCancelled:withEvent: called instead (as already explained), or it will declare failure and now the view will have touchesEnded:withEvent: called.

The reason for this behavior is most obvious with a gesture where multiple taps are required. The first tap ends, but this is insufficient for the gesture recognizer to declare success or failure, so it withholds that touch from the view. In this way, the gesture recognizer gets the proper priority. In particular, if there is a second tap, the gesture recognizer should succeed and send touchesCancelled:withEvent: to the view — but it can’t do that if the view has already been sent touchesEnded:withEvent:.

It is also possible to delay the entire suite of touches... methods from being called on a view, by setting a gesture recognizer’s delaysTouchesBegan property to YES. Again, this delay would be until the gesture recognizer can resolve the gesture: either it will recognize it, in which case the view will have touchesCancelled:withEvent: called, or it will declare failure, in which case the view will receive touchesBegan:withEvent: plus any further touches... calls that were withheld — except that it will receive at most one touchesMoved:withEvent: call, the last one, because if a lot of these were withheld, to queue them all up and send them all at once now would be simply insane.

It is unlikely that you’ll change a gesture recognizer’s delaysTouchesBegan property to YES, however. You might do so, for example, if you have an elaborate touch analysis within a view that simply cannot operate simultaneously with a gesture recognizer, but this is improbable, and the latency involved may look strange to your user.
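To summarize the three properties discussed so far, here is a sketch of a double-tap gesture recognizer configured with its default values spelled out explicitly. The controller name and the doubleTap: action are hypothetical:

```objc
#import <UIKit/UIKit.h>

// Hypothetical view controller, for illustration only
@interface DelayViewController : UIViewController
@end

@implementation DelayViewController
- (void) viewDidLoad {
    [super viewDidLoad];
    UITapGestureRecognizer* t2 = [[UITapGestureRecognizer alloc]
        initWithTarget:self action:@selector(doubleTap:)];
    t2.numberOfTapsRequired = 2;
    // default: on recognition, the view is sent touchesCancelled:withEvent:
    t2.cancelsTouchesInView = YES;
    // default: touchesEnded:withEvent: is withheld until the gesture resolves
    t2.delaysTouchesEnded = YES;
    // default: the view is sent touchesBegan:withEvent: right away
    t2.delaysTouchesBegan = NO;
    [self.view addGestureRecognizer:t2];
}

- (void) doubleTap: (UIGestureRecognizer*) g {
    NSLog(@"double tap");
}
@end
```

Since these are the defaults, you would normally omit all three assignments; they appear here only so that you can see where each switch lives.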

When touches are delayed and then delivered, what’s delivered is the original touch with the original event, which still have their original timestamps. Because of the delay, these timestamps may differ significantly from now. For this reason (and many others), Apple warns that touch analysis that is concerned with timing should always look at the timestamp, not the clock.
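Here is a sketch of what that advice means in practice: measure how long a finger was down by subtracting one touch timestamp from another, rather than by consulting the current time. The class name MyView, the instance variable, and the half-second threshold are all hypothetical:

```objc
#import <UIKit/UIKit.h>

@interface MyView : UIView
@end

@implementation MyView {
    NSTimeInterval _downTime; // timestamp of the touch when it began
}

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    UITouch* t = [touches anyObject];
    _downTime = t.timestamp;
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    UITouch* t = [touches anyObject];
    // compare touch timestamps with each other, not with the clock;
    // this stays correct even if delivery of the touch was delayed
    NSTimeInterval held = t.timestamp - _downTime;
    if (held > 0.5)
        NSLog(@"finger was down for %f seconds", held);
}
// touchesMoved:withEvent: and touchesCancelled:withEvent: omitted for brevity
@end
```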

Touch Exclusion Logic

It is up to the UIWindow’s sendEvent: to implement the logic of multipleTouchEnabled and exclusiveTouch.

If a new touch is hit-tested to a view whose multipleTouchEnabled is NO and which already has an existing touch hit-tested to it, then sendEvent: never delivers the new touch to that view. However, that touch is delivered to the view’s swarm of gesture recognizers.

Similarly, if there’s an exclusiveTouch view in the window, then sendEvent: must decide whether a particular touch should be delivered, as already described. If a touch is not delivered to a view because of exclusiveTouch restrictions, it is not delivered to its swarm of gesture recognizers either. (This behavior with regard to gesture recognizers has changed in a confusing and possibly buggy way from system to system, but I believe I’m describing it correctly for iOS 5 and later. The statement in Apple’s SimpleGestureRecognizers sample code that “Recognizers ignore the exclusive touch setting for views” now appears to be false.)

Recognition

When a gesture recognizer recognizes its gesture, everything changes. As we’ve already seen, the touches for this gesture recognizer are sent to their hit-test views as a touchesCancelled:withEvent: message, and then no longer arrive at those views (unless the gesture recognizer’s cancelsTouchesInView is NO). Moreover, all other gesture recognizers pending with regard to these touches are made to fail, and then are no longer sent the touches they were receiving either.

If the very same event would cause more than one gesture recognizer to recognize, there’s an algorithm for picking the one that will succeed and make the others fail: a gesture recognizer lower down the view hierarchy (closer to the hit-test view) prevails over one higher up the hierarchy, and a gesture recognizer more recently added to its view prevails over one less recently added.

There are various means for modifying this “first past the post, winner takes all” behavior. One is by telling a gesture recognizer, in effect, that being first isn’t good enough:

  • requireGestureRecognizerToFail: institutes a dependency order, possibly causing the gesture recognizer to which it is sent to be put on hold when it tries to transition from the Possible state to the Began (continuous) or Ended (discrete) state; only if a certain other gesture recognizer fails is this one permitted to perform that transition. Apple says that in a dependency like this, the gesture recognizer that fails first is not sent reset (and won’t receive any touches) until the second finishes its state sequence and is sent reset, so that they resume recognizing together.
  • The UIView method gestureRecognizerShouldBegin:, sent to the hit-test view, or the delegate method gestureRecognizerShouldBegin:, by returning NO, turns success into failure; at the moment when the gesture recognizer is about to declare that it recognizes its gesture, transitioning from the Possible state to the Began (continuous) or Ended (discrete) state, it is forced to fail instead.
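The classic use of a dependency is to distinguish a single tap from a double tap in the same view. The following sketch shows that, along with a delegate that can veto the single tap; the controller name, the action names, and the tapsAllowed property are hypothetical:

```objc
#import <UIKit/UIKit.h>

// Hypothetical view controller acting as gesture recognizer delegate
@interface TapViewController : UIViewController <UIGestureRecognizerDelegate>
@property (nonatomic) BOOL tapsAllowed; // hypothetical flag
@end

@implementation TapViewController
- (void) viewDidLoad {
    [super viewDidLoad];
    UITapGestureRecognizer* t2 = [[UITapGestureRecognizer alloc]
        initWithTarget:self action:@selector(doubleTap:)];
    t2.numberOfTapsRequired = 2;
    [self.view addGestureRecognizer:t2];

    UITapGestureRecognizer* t1 = [[UITapGestureRecognizer alloc]
        initWithTarget:self action:@selector(singleTap:)];
    // t1 is held in the Possible state until t2 has failed
    [t1 requireGestureRecognizerToFail:t2];
    t1.delegate = self;
    [self.view addGestureRecognizer:t1];
}

- (void) singleTap: (UIGestureRecognizer*) g { NSLog(@"single tap"); }
- (void) doubleTap: (UIGestureRecognizer*) g { NSLog(@"double tap"); }

// Returning NO here turns t1's imminent success into failure
- (BOOL) gestureRecognizerShouldBegin:(UIGestureRecognizer *)g {
    return self.tapsAllowed;
}
@end
```

The price of the dependency is a slight delay before the single tap is reported, because the single tap recognizer cannot succeed until the double tap recognizer has had time to fail.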

Another approach is to permit simultaneous recognition; a gesture recognizer succeeds, but some other gesture recognizer is not forced to fail. There are two ways to achieve this:

  • A subclass can implement canPreventGestureRecognizer: or canBePreventedByGestureRecognizer: (or both). Here, “prevent” means “by succeeding, you force failure upon this other,” and “be prevented” means “by succeeding, this other forces failure upon you.”

    These two methods work together as follows. canPreventGestureRecognizer: is called first; if it returns NO, that’s the end of the story for that gesture recognizer, and canPreventGestureRecognizer: is called on the other gesture recognizer. But if canPreventGestureRecognizer: returns YES when it is first called, the other gesture recognizer is sent canBePreventedByGestureRecognizer:. If it returns YES, that’s the end of the story; if it returns NO, the process starts over the other way around, sending canPreventGestureRecognizer: to the second gesture recognizer, and so forth. In this way, conflicting answers are resolved without the device exploding: prevention is regarded as exceptional (even though it is in fact the norm) and will happen only if it is acquiesced to by everyone involved.

  • The delegate method gestureRecognizer:shouldRecognizeSimultaneouslyWithGestureRecognizer: can return YES to permit one gesture recognizer to succeed without forcing the other to fail.
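Both approaches can be sketched briefly. The class names here are hypothetical, and returning a constant from each method is just the simplest possible policy, not a recommendation:

```objc
#import <UIKit/UIKit.h>
#import <UIKit/UIGestureRecognizerSubclass.h>

// Subclass approach: this recognizer neither prevents nor is prevented
@interface MyTapRecognizer : UITapGestureRecognizer
@end

@implementation MyTapRecognizer
- (BOOL) canPreventGestureRecognizer:(UIGestureRecognizer *)other {
    return NO; // by succeeding, we never force failure upon another
}
- (BOOL) canBePreventedByGestureRecognizer:(UIGestureRecognizer *)other {
    return NO; // by succeeding, no other forces failure upon us
}
@end

// Delegate approach: any object set as a gesture recognizer's delegate
@interface MyRecognizerDelegate : NSObject <UIGestureRecognizerDelegate>
@end

@implementation MyRecognizerDelegate
- (BOOL) gestureRecognizer:(UIGestureRecognizer *)g
    shouldRecognizeSimultaneouslyWithGestureRecognizer:(UIGestureRecognizer *)other {
    return YES; // let g succeed without forcing other to fail
}
@end
```

In real code you would usually examine the other gesture recognizer (its class, or its view) before answering, rather than answering the same way for all comers.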

Touches and the Responder Chain

A UIView is a responder, and participates in the responder chain (Chapter 11). In particular, if a touch is to be delivered to a UIView (because, for example, it’s the hit-test view) and that view doesn’t implement the relevant touches... method, a walk up the responder chain is performed, looking for a responder that does implement it; if such a responder is found, the touch is delivered to that responder. Moreover, the default implementation of the touches... methods — the behavior that you get if you call super — is to perform the same walk up the responder chain, starting with the next responder in the chain.

The relationship between touch delivery and the responder chain can be useful, but you must be careful not to allow it to develop into an incoherency. For example, if touchesBegan:withEvent: is implemented in a superview but not in a subview, then a touch to the subview will result in the superview’s touchesBegan:withEvent: being called, with the first parameter (the touches) containing a touch whose view is the subview. But most UIView implementations of the touches... methods rely upon the assumption that the first parameter consists of all and only touches whose view is self; built-in UIView subclasses certainly assume this.

Again, if touchesBegan:withEvent: is implemented in both a superview and a subview, and you call super in the subview’s implementation, passing along the same arguments that came in, then the same touch delivered to the subview will trigger both the subview’s touchesBegan:withEvent: and the superview’s touchesBegan:withEvent: (and once again the first parameter to the superview’s touchesBegan:withEvent: will contain a touch whose view is the subview).

The solution is to behave rationally, as follows:

  • If all the responders in the affected part of the responder chain are instances of your own subclass of UIView itself or of your own subclass of UIViewController, you will generally want to follow the simplest possible rule: implement all the touches... methods together in one class, so that touches arrive at an instance either because it was the hit-test view or because it is up the responder chain from the hit-test view, and do not call super in any of them. In this way, “the buck stops here”: the touch handling for this object or for objects below it in the responder chain is bottlenecked into one well-defined place.
  • If you subclass a built-in UIView subclass and you override its touch handling, you don’t have to override every single touches... method, but you do need to call super so that the built-in touch handling can occur.
  • Don’t allow touches to arrive from lower down the responder chain at an instance of a built-in UIView subclass that implements built-in touch handling, because such a class is completely unprepared for the first parameter of a touches... method containing a touch not intended for itself. Judicious use of userInteractionEnabled or hit-test munging can be a big help here.

    I’m not saying, however, that you have to block all touches from percolating up the responder chain; it’s normal for unhandled touches to arrive at the UIWindow or UIApplication, for example, because these classes do not (by default) do any touch handling — so those touches will remain unhandled and will percolate right off the end of the responder chain, which is perfectly fine.

  • Never call a touches... method directly (except to call super).
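The first rule above might be sketched like this for a direct subclass of UIView. The class name BuckStopsHereView is hypothetical, and the logging merely marks where your own touch handling would go:

```objc
#import <UIKit/UIKit.h>

// Hypothetical direct subclass of UIView that bottlenecks touch handling
@interface BuckStopsHereView : UIView
@end

@implementation BuckStopsHereView
// All four touches... methods are implemented together, and none calls super,
// so touches stop here instead of continuing up the responder chain.

- (void) touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event {
    for (UITouch* t in touches) {
        // a touch may have arrived from a subview lower down the chain
        if (t.view == self)
            NSLog(@"touch began in me");
        else
            NSLog(@"touch began in a subview: %@", t.view);
    }
}

- (void) touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event {
    // ... track movement ...
}

- (void) touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event {
    // ... finish up ...
}

- (void) touchesCancelled:(NSSet *)touches withEvent:(UIEvent *)event {
    // ... restore any state changed in touchesBegan:withEvent: ...
}
@end
```

Checking each touch's view, as in touchesBegan:withEvent: here, is how such a class can cope with touches that were hit-tested to a subview rather than to itself.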

Warning

Apple’s documentation has some discussion of a technique called event forwarding where you do call touches... methods directly. But you are far less likely to need this now that gesture recognizers exist, and it can be extremely tricky and even downright dangerous to implement, so I won’t give an example here, and I suggest that you not use it.