Functions

Functions as mappings

[Video Tutorials] (coming soon)

In this tutorial, we go over the basics of functions, which are really just a special form of relation.

Relations can have various properties, such as reflexive, symmetric, transitive, and more. Now we consider a special property a relation can have: relations on A\times B such that every element a\in A has exactly one element b\in B such that aRb, like we saw in the linear relation example in the Relations tutorial. This is a very useful property, and allows us to ask things like “which element from B is related to a?” Not the singular form, since there is exactly one element related to a.

This kind of relation is quite common. Consider A being the set of student ID numbers, and B being the set of names. Every student has exactly one name, so you might have ( 771273123, Mary Smith)\in R if the student with ID number 771273123 is named Mary Smith. Since there’s exactly one element of B (i.e., one name) for each student, you can ask things like “what is the name of student 771273123?” Note that we could have multiple students named Mary Smith, with different ID numbers, but each student (ID number) has only one name. [Fun fact: In December 2025, “Mary Smith” was the most common female name on the North Carolina voter registration roll, with 583 different active registered North Carolina voters named “Mary Smith” out of 6,575,049 total active registered voters. There were 1,141 “James Smith”s registered.]

When we have small sets for A and B, we can specify this relation using a table, like we did at the beginning of the Relations tutorial for the “is taking” relation using a table. The relation between student IDs and names is given in the table below. Notice how each ID only appears on one row, but the name “Mary Smith” appears twice, meaning there are two different students named Mary Smith.

Table 1: Defining a function with a table
Student ID Name
771273123 Mary Smith
771731322 James Williams
772337112 William Jones
773563721 Mary Smith
777604359 Linda Johnson

This is precisely what a function is, although we use language specific to functions in this case rather than the general relation terminology. In our student ID/name example, we could say that this function “maps” student 771273123 to their name (“Mary Smith”). We give the general, formal definition below.

Definition 1
A function f from set X to set Y relates, or maps, each element x\in X to exactly one element y\in Y. If f maps x to y, we can write f(x)=y. We read f(x) as “f of x”. The variable x is simply a placeholder, and is called the argument of f.

Specifying the form of the function can be written as f:X\rightarrow Y, which is read as “f maps X to Y.” The set X is called the domain of f, and Y is the codomain of f.

Two functions f:A\rightarrow B and g:C\rightarrow D are equal if and only if their domain and range are the same (i.e., A=C and B=D), and f(x)=g(x) for all x\in A.

More terminology for specific cases: When the domain and range are the same, as in f:X\rightarrow X, we call f a function on X or sometimes a function over X. For example, f:\mathbb{R}\rightarrow\mathbb{R} can be called a “function over the reals.” When the codomain is a set of numbers, we can refer to the function by the values it takes, saying (for example) that f:X\rightarrow\mathbb{R} is a “real-valued function” (with similar terms for integer-valued or rational-valued functions).

The typical convention in mathematical writing is that generic function names are lower case letters f, g, and h. In our tutorials we will sometimes indicate a specific function with a subscript (like f_{sq}).

Tip

Some students have the mistaken impression that parentheses in formulas are only for grouping, and are not necessary when not used to indicate order/precedence of operations. In the case of “f(x)”, the parentheses are a required part of the notation, and may not be omitted. Think of them as the “of” part when reading “f of x.” While we don’t cover big-oh notation in this tutorial, the situation is the same there: O(n) requires parentheses, and is read “big-oh of n.”

Functions are sometimes referred to as mappings since they map elements in the domain to elements of the codomain. There are various ways that functions can be specified. Sometimes they are given with algebraic formulas, such as defining the “squaring function” f_{sq}:\mathbb{R}\rightarrow\mathbb{R} with f_{sq}(x)=x^2. Notice that we could define a function f_{isq}:\mathbb{Z}\rightarrow\mathbb{Z} using the same formula f_{isq}(x)=x^2, but this would be a different function since the domain and codomain are different.

Algebraically defined functions can be more complicated than simple formulas, for example with piecewise definitions defined using cases as in the absolute value function f_{abs}:\mathbb{R}\rightarrow\mathbb{R}:

f_{abs}(x)=\begin{cases} x & \text{if } x\geq 0;\\ -x & \text{otherwise.} \end{cases}

We can even define functions using algorithms (or code) for even more complex definitions, but not everything you can write in code as a “function” is actually a mathematical function. See the discussion at the end of this tutorial for more information.

There is nothing that requires a function to actually depend on its argument, so f(x)=2 is a perfectly valid function (sometimes called a constant function).

Note

A function that isn’t defined on some points of the domain is called a partial function. For example, f:\mathbb{R}\rightarrow\mathbb{R} defined as f(x)=1/x is a partial function (and not a function) since f(0) is undefined. You can always restrict the domain, so in this case the same formula as f:\mathbb{R}-\{0\}\rightarrow\mathbb{R} is a function. We will not use the notion of partial functions in these tutorials.

While you may have seen a lot of these algebraic or numeric functions in previous math classes, remember that nothing in the definition requires it to have anything to do with numbers – it simply maps from one set to another.

Our initial example above, mapping student IDs to names is an example of a function with non-numeric domain and codomain. Let’s define another non-numeric function, mapping from the four main officers of a student club to students given with their ID numbers. We assume no shared offices, like co-presidents, so this is a function. Table 2 below shows this function, and we’ll use this function in a later example.

Table 2: Function mapping officers to students
Officer Student ID
President 771273123
Vice President 772337112
Secretary 777604359
Treasurer 771731322

A common visualization of functions shows a mapping as connections between elements in the domain to elements in the codomain, and we use small finite cases to illustrate general concepts about functions. For example, consider defining a function f:X\rightarrow Y where X=\{1,2,3,4\}, Y=\{a,b,c,d,e\}, and the function is defined by f(1)=c, f(2)=b, f(3)=d, and f(4)=c. We can visualize this as shown in the following figure.

Figure 1: Basic Function Visualization

Notice that every element in the domain maps to exactly one element in the codomain, but there is no requirement that every element of the codomain has just one (or even any at all) elements from the domain that map to it.

Definition 2
For a function f:X\rightarrow Y, the image of an element x\in X is the value f(x)\in Y. The image of subset S\subseteq X, which we can write as f(S), is the set of all values that elements in S map to:

f(S)=\{y\in Y\ |\ y=f(x)\text{ for } x\in S\} .

The range of function f is the image of the domain X.

The preimage of an element y\in Y is the set of values in the domain that map to that value: \{ x\in X\ |\ f(x)=y\}. We can also define the preimage of a subset of S\subseteq Y in the obvious way: \{ x\in X\ |\ f(x)\in S\}.

Example 3

Using the function illustrated in Figure 1, we give examples of these concepts:

  1. The image of 2 is b
  2. The image of \{2,3\} is \{b,d\}
  3. The image of \{1,4\} is \{c\}
  4. The preimage of c is \{1,4\}
  5. The preimage of b is \{2\}
  6. The preimage of a is \emptyset

Note that the image of an element in X is an element in Y (example i), and the image of a set is always a set, even when that set has just a single element (example iii).

The preimage of an element in Y is always a set, which can have multiple elements of the domain (example iv), a single element of the domain (example v), or even no elements (example vi – giving the empty set).

The range of the function shown in Figure 1 is the image of the domain, or equivalently the set of values in the codomain that have something mapping to them (or the set of values in the codomain with a non-empty preimage). For our function in Figure 1, the range is the set \{b,c,d\}, which is illustrated as the shaded area below.

Figure 2: Range of Function f
Tip

Since the terms “codomain” and “range” both refer to a set that a function maps to, students often have difficulty remembering which term means what. Here’s how I remember: In regular English, we might say something like “the final exam scores range from 38 to 98,” where those refer to actual grades. There is a hypothetically possible grade of 100, but if no one made that score we wouldn’t say the grades “range to 100.” So the use of “ranges” in English is analogous to the mathematical definition of “range.” While a score of 100 might be in the codomain, it is not in the range.

The range then is the set of values that are actually mapped to: the image of the domain. The codomain is a potentially larger set that is generally used for convenience – for example, defining the function f_{sq}:\mathbb{R}\rightarrow\mathbb{R} as a function over the reals, even though it only maps to non-negative reals.

When you talk about images and preimages, use articles (a, an, and the) correctly. You would talk about “the” image of an element in the domain (e.g., the image of 2). However, if you talk about the preimage of an element you should use “the” only if you’re talking about the entire preimage set. When talking about individual elements from the domain, you should use “a”. For example, you should say that 1 is a preimage of c.

Going back the f_{sq} function, we would say that the preimage of 4 is the set \{-2,2\}, or that 2 is a preimage of 4.

Function Properties

We can define various properties of functions, based on how they map to elements of the codomain.

Definition 4
A function f:X\rightarrow Y is injective if f(x)=f(y) only when x=y (in other words, no two different elements in the domain map to the same element in the codomain).

A function f:X\rightarrow Y is surjective if every element of Y has a non-empty preimage. Equivalently, we could say that every element in Y is mapped to by some element of the domain, or that the range is equal to the codomain.

Alternative terminology: Some people say “one-to-one” instead of injective, and “onto” instead of surjective. These are the exact same concepts, just using different terms.

Our original example function, in Figure 1, was neither injective (since f(1)=f(4)) nor surjective (since the preimages of a and e are both empty). We can make small changes to the function to satisfy these properties, as illustrated in the figure below.

(a) Injective Function
(b) Surjective Function
Figure 3

In Figure 3 (a), we define function g:X\rightarrow Y so that every element of the domain maps to a different element of the codomain, so this function is injective. However, since e has an empty preimage under g, it is not surjective.

In Figure 3 (b), we have the same basic mapping as in Figure 1, but have shrunk the codomain to match the range, giving a surjective function f':X\rightarrow Z. However, since f(1)=f(4), this function is not injective.

The second example, where we shrunk the codomain to fit the range, demonstrates an especially important point: The same mapping, over different sets, can have different properties. For example, f_{sq}: \mathbb{R}\rightarrow\mathbb{R} is not surjective, since no negative value in the codomain is mapped to. However, f_{sq}: \mathbb{R}\rightarrow\mathbb{R}^{\geq 0}, where \mathbb{R}^{\geq 0} is the set of all non-negative real numbers, is surjective. But this restriction on the codomain isn’t enough over the integers, as f_{sq}:\mathbb{Z}\rightarrow\mathbb{Z}^{\geq 0} is not surjective since 5 has no preimage over the integers.

Justifying that a function is injective or surjective is often not difficult, but a formal justification of this is a proof, and we postpone doing this formally until our tutorial on writing proofs. However, showing that a function is not injective, or not surjective, is much easier and generally is done by giving a specific counterexample to the required property.

Example 5
To show that a function is not injective, you can give two values x and y with x\neq y and f(x)=f(y). For example, with f_{sq}:\mathbb{R}\rightarrow\mathbb{R}, we can show that it is not injective by noting that f_{sq}(-2)=f_{sq}(2)=4.

To show that a function is not surjective, give a value in the codomain that is not mapped to. For example, for f_{sq}:\mathbb{R}\rightarrow\mathbb{R}, we show this is not surjective by pointing out that f_{sq}(x)=-1 has no solution over the reals (we would need to expand the domain and codomain to complex numbers to have a solution).

Definition 6
A function f:X\rightarrow Y is bijective if it is both injective and surjective.

Alternative terminology: Authors who use the other terminology mentioned above will call a bijection a “one-to-one and onto function” or a “one-to-one correspondence.”

We can visualize a bijective function as in the figure below – note that no point in the codomain has multiple elements mapping to it (so it is injective), and every point in the codomain is mapped to (so it is surjective).

Figure 4: Bijective Function

Illustrations like the one above use a finite number of points, and one important property of bijections over a finite domain is that the domain and codomain must have the same size. In fact, bijections we look at typically have the same domain and codomain, so the bijection mapping is just a permutation of the elements (if you don’t know that terminology, we’ll get to it in the combinatorics and counting tutorials). Since the domain and codomain must be the same size for a bijection, and the domain and codomain of our initial function visualization in Figure 1 have different sizes, no bijection is possible between these sets X and Y.

Bijections of infinite domains are typically defined algebraically, and we can use basic algebra to reason about them being bijections (or not). Two examples of this follow.

Example 7
Define f:\mathbb{R}\rightarrow\mathbb{R} as f(x)=ax+b, where a,b\in\mathbb{R} and a\neq 0. We will use basic algebra to show that f is bijective.

First, we show that a\,x+b is injective: If a\,x_1+b=a\,x_2+b, then we can subtract b from both sides to get a\,x_1=a\,x_2. Then, since a\neq 0 we can divide both sides by a to get x_1=x_2. So whenever f(x_1)=f(x_2), we have x_1=x_2, which is the definition of injective.

Next, we show that f is surjective. For any y\in\mathbb{R}, let x=\frac{y-b}{a} (remember that a is non-zero, so this is well-defined). We claim that f(x)=y, and since we can find such an x for every y\in\mathbb{R} it follows that f is surjective. To see this, note that

f(x)=f\left(\frac{y-b}{a}\right)=a\,\frac{y-b}{a}+b = y-b+b = y . \tag{1}

In the previous example, the requirement a\neq 0 just means that f is not a constant function. Since there are no other restrictions on a and b, meaning f can be any non-constant linear function, the previous example shows that all non-constant linear functions are bijections.

Example 8
We define f:\mathbb{R}\rightarrow\mathbb{R} as a function over the reals with f(x)=x^2-2x+2, and show that this function is not bijective. This function, in fact, is neither injective nor surjective.

To see that f(x)=x^2-2x+2 is not injective, consider f(0)=2 and f(2)=2^2-2\cdot 2+2=4-4+2=2. So f(0)=f(2), meaning f is not injective.

To see that f(x) is not surjective, consider y=0. To find x values such that f(x)=0 we can use the quadratic formula to find all complex roots, giving 1+i and 1-i as the only two roots of f(x) over \mathbb{C}. Since neither root is real, this means that at y=0 function f (a function over the reals) has an empty preimage, so f is not surjective.

Note how important the domain and codomain are: If we had defined the function as f:\mathbb{C}\rightarrow\mathbb{C} (using the same algebraic formula), then f would be surjective. It still would not be injective however.

Rather than considering injective and surjective properties explicitly, we can argue that a function is bijective using an equivalent condition on preimages, as stated in the following theorem.

Theorem 9
Function f:X\rightarrow Y is a bijection if and only if the preimage set for every y\in Y has cardinality 1.

For the special case of bijective functions, we will sometimes refer to the unique value x\in X with f(x)=ythe preimage of y,” even though the preimage of y is technically the set containing this element.

We next introduce a special function which just maps every element to itself, which seems like it isn’t useful, but will be important below when we talk about operations on functions.

Definition 10
For any set S, let I_S:S\rightarrow S be the identity function on S, which maps every element to itself. In other words, I_S(x)=x for all x\in S, and I_S is bijective.

Warning

Unfortunately, notation for the identity function is not standardized across all authors. In our tutorials, we use I_S for the identity function on S, which is the same as Susanna Epp uses in her Discrete Mathematics book. However, Rosen uses \iota_S (that’s the lower-case Greek iota), and Hein uses \text{id}_S. Our usual reminder applies: if you’re reading these tutorials while taking a discrete mathematics class, make sure you use whatever notion your class uses!

Finally, we introduce an important property for numeric functions, which gives a sufficient condition for such a function to be injective (and bijective).

Definition 11
A function f:X\rightarrow X, where X is some set of numbers, is increasing if f(x)>f(y) whenever x>y. f is non-decreasing if f(x)\geq f(y) whenever x>y.

If you find the notation confusing, then stop and think through what this is saying. If f is increasing, whenever x increases then f(x) also increases. If f is non-decreasing, whenever x increases, f(x) does not get any smaller.

Warning

Unfortunately, we are aware of at least one author of a popular discrete mathematics textbook that uses terminology differently, saying any non-decreasing function is “increasing” and using “strictly increasing” for what is commonly called an increasing function. While we repeat our advice to use whatever terminology your course or context requires, with deference to other author’s choices when appropriate, this is one “terminology difference” that we feel is simply wrong. “Increasing” has a meaning, and it means it gets bigger. Non-decreasing does not mean “increasing” any more than “less than” can mean “less than or equal.”

Increasing and non-decreasing functions have important properties, but the most important one for now is the following.

Theorem 12
If f:X\rightarrow X is an increasing function, then f is injective. Furthermore, if f is surjective, then f is bijective.

We’ll prove this in the tutorials on proofs, but the intuition is straightforward: all values y>x have f(y)>f(x), so we can never repeat a value in the mapping.

Operations on functions

You can perform operations on functions to produce new functions, much like you can perform operations on numbers (addition, subtraction, …) to produce new numbers, or operations on sets (union, intersection, …) to produce new sets. If you have two functions that have the same domain, and codomains of numbers, then you can perform standard algebraic operations with them.

Definition 13
If f_1:X\rightarrow \mathbb{R} and f_2:X\rightarrow\mathbb{R} are real-valued functions, then we can define new functions as follows:

  • (f_1+f_2):X\rightarrow\mathbb{R} is the function defined that maps (f_1+f_2)(x)=f_1(x)+f_2(x) for all x\in X.

  • (f_1-f_2):X\rightarrow\mathbb{R} is the function defined that maps (f_1-f_2)(x)=f_1(x)-f_2(x) for all x\in X.

  • (f_1f_2):X\rightarrow\mathbb{R} is the function defined that maps (f_1f_2)(x)=f_1(x)f_2(x) for all x\in X.

This same idea can be applied to functions with codomain \mathbb{Z} or \mathbb{Q}. Note that, even with \mathbb{Q} and \mathbb{R} we cannot necessarily do the same with division, since f_2(x) might be zero for some x and then the quotient wouldn’t be defined at that point.

Important

There is an important, but subtle point to understand. We are defining a new function with f_1+f_2, not just evaluating using two functions. We can talk about the properties of function f_1+f_2, whereas if we just do two evaluations with f_1(x)+f_2(x) we haven’t defined a new function and so can’t talk about whether the operation we’re implying is injective, surjective, etc.

Let’s consider an example showing these operations.

Example 14
In this example, we define f_1 and f_2 as functions over the reals, with f_1=2x+1 and f_2=x+5.

We define a new function f_3:\mathbb{R}\rightarrow\mathbb{R} as f_3=f_1+f_2, and we can get an algebraic formula for f_3 as

f_3(x)=f_1(x)+f_2(x)=(2x+1)+(x+5)=3x+6 \ .

To emphasize the earlier important note: the function f_3 is the mapping f_3(x)=3x+6. It is not just evaluating f_1(x) and f_2(x) as if they were functions to call in a programming language. Where f_3 “came from” or how it was defined is irrelevant to the fact that f_3(x)=3x+6.

Next, we define function f_4:\mathbb{R}\rightarrow\mathbb{R} as f_4=f_1f_2, so

f_4(x)=(2x+1)(x+5)=2x^2+10x+x+5=2x^2+11x+5\ .

All of the definitions and operations in this example are operations on polynomials, and while we don’t get into this at this point, think about how the degree of f_1+f_2 and f_1f_2 depend on the degree of the polynomials f_1 and f_2. This is very important in some applications!

Next, we look at an operation that can be performed on arbitrary functions, not just on numeric functions.

Definition 15
If f:X\rightarrow Y and g:Y\rightarrow Z are functions, then the composition of f and g, written g\circ f, is a function (g\circ f): X\rightarrow Z in which (g\circ f)(x)=g(f(x)) for all x\in X.

Note that when you evaluate the function g\circ f at a point x, the functions are applied in right-to-left order. In other words, you first calculate y=f(x) and then you calculate g(y). The right-to-left order reflects the parenthesized evaluation g(f(x)).

Warning

The terminology for composition varies in subtle and unfortunate ways between authors. When we say “the composition of f and g” we use the word “and” for which order does not matter in either English or in logic. For example, “eggs and bacon” is the same as “bacon and eggs.” However, we’ll see in the following example that the order of composition does matter, so “the composition of f and g” is not the same as “the composition of g and f.” Remember that the ordering is reversed in the notation: the composition of f and g is g\circ f. Unfortunately, at least one author of a popular discrete math textbook writes this in the reverse, calling g\circ f the “composition of g and f.” The meaning of the notation g\circ f is standard and the same in all cases – it’s just the English language expression that differs. If you are a student, make sure you know what your instructor requires!

Example 16
Consider f(x)=x+1 and g(y)=y^2+y+1. We’ll first evaluate g\circ f at a couple of specific points, using the formula g(f(x)).

When x=0, we have f(x)=1, and g(f(x))=g(1)=1^2+1+1=3.

When x=1, we have f(x)=2, and g(f(x))=g(2)=2^2+2+1=7.

Next, we’ll see what the function g\circ f is. To do this, note that when evaluating g(y), y can be anything – it doesn’t just mean plugging in a specific number. In other words, we can replace each y in the formula for g(y) with f(x) and then expand the resulting f(x)’s, getting

(g\circ f)(x)=g(f(x))=[f(x)]^2+f(x)+1 =(x+1)^2+(x+1)+1

= (x^2+2x+1)+(x+1)+1 = x^2+3x+3 .

Evaluating at the two points we did above, seeing that (g\circ f)(0)=3 and (g\circ f)(1)=1^2+3\cdot 1+3=7. They match!

Let’s try reversing the order of the composition, getting

(f\circ g)(x)=f(g(x))=g(x)+1=(x^2+x+1)+1=x^2+x+2 .

We can see that (f\circ g)(0)=2\neq (g\circ f)(0), which shows that the function composition operator is not commutative – order matters!

Important point: The name of the parameter in a function definition is not relevant, and just shows where values are inserted into the formula. It’s just a symbol, and we could just as easily have defined g(\clubsuit)=\clubsuit^2+\clubsuit+1. But more importantly, note that we used the variable y in the definition of g to avoid confusion, but it was completely unnecessary. Defining g(x)=x^2+x+1 is exactly the same function, and this is the super-important part: When substituting g(x) in to function f to create f(g(x)), the x in the definition of f is completely different from the x in the definition of g. Substituting g(x) into f(x)=x+1 means replacing every x on the right-hand side with g(x). We’re not saying x=g(x) mathematically, but rather are doing a textual substitution everywhere x appears.

If you’re computing the composition of two functions that use the same parameter name, you can avoid confusion by temporarily rewriting one of the functions to use a different and unique parameter name.

Composing non-numeric functions can’t be done algebraically, but can be evaluated point by point.

Example 17

Let f be the function from officers to student IDs, given above in Table 2, and let g be the function from student IDs to names as shown in Table 1. Then the composition of f and g, namely g\circ f, maps officers to names.

For example, f( Secretary )= 7776042359 and g( 7776042359 )= Linda Johnson, so composing these functions gives (g\circ f)( Secretary )= Linda Johnson.

Evaluating the composition at each point in the domain (officers), we map to student ID and then to name, giving the following function.

Composition of officer to ID and ID to name functions
Officer Student ID Name
President 771273123 Mary Smith
Vice President 772337112 William Jones
Secretary 777604359 Linda Johnson
Treasurer 771731322 James Williams

Inverse function

We next consider the following idea: In the pictures above, we have arrows mapping from the domain to the codomain. What if we reversed the directions of those arrows, so they pointed from the codomain to the domain? Could that be a valid function? Using this natural question, we’ll briefly refer to this as the “reverse mapping” of f, but emphasize that this is not a standard term. We’re only using it temporarily to give some intuition behind an important concept.

Let’s go back to the definition of a function (Definition 1) which says a function: maps “each element of x\in X to exactly one element y\in Y”. When we reverse the direction of the mapping, Y becomes the domain and X becomes the codomain, so we would need it to map “each element of y\in Y to exactly one element x\in X”. If f is not surjective, then it doesn’t map each element of Y. If f is not injective, then it maps an element of Y to more than one element of X. In other words, if f is either not injective or not surjective then the reverse mapping is not a function.

It turns out that the converse of this is also true: if the function f is both injective and surjective (in other words, it is bijective), then the reverse mapping is a function.

Referring back to Theorem 9, if f is a bijection then the preimage of every y\in Y has cardinality 1, so the reverse mapping will map y to its unique preimage. This is called the inverse function, which we formalize in the following definition.

Definition 18
If f:X\rightarrow Y is a bijective function, then the function is called invertible, and the inverse of f is a function f^{-1}:Y\rightarrow X such that f^{-1}(y) is the unique element of X in the preimage of y under f. It follows that for every y\in Y, f(f^{-1}(y))=y, and therefore f\circ f^{-1}=I_Y (the identity function on Y).

Tip

The notation for inverse confuses some students, since they may have seen the power “-1” used to represent a multiplicative inverse – in other words, for non-zero x\in\mathbb{R}, x^{-1} is just 1/x. However, when f is a function, this powering notation does not mean division or a fraction or a multiplicative inverse or anything like that. It means the functional inverse, as defined above. The notation actually comes from looking at functional composition in the context of a mathematical structure known as a “group,” but it’s not important that you know that for basic discrete mathematics topics. Just know that this is a different use of the power of “-1,” and don’t confuse it with what that means for numbers!

Example 19
In Example 7 we argued that the function f(x)=ax+b is bijective, using x=\frac{y-b}{a}. While we didn’t have the terminology at the time, this is the inverse of f. In other words, f^{-1}(y)=\frac{y-b}{a}. We showed that f(f^{-1}(y))=y in Equation 1.

Example 20
We saw in Example 5 that f_{sq}: \mathbb{R}\rightarrow\mathbb{R} is neither injective nor surjective, so it does not have an inverse. We can either expand the domain and codomain to \mathbb{C} to make this bijective, or we can restrict the domain and codomain to just the non-negative reals.

Specifically, consider f_{sq+}:\mathbb{R}^{\geq 0}\rightarrow\mathbb{R}^{\geq 0}. This is an increasing function on \mathbb{R}^{\geq 0}, and is surjective, so by Theorem 12 the function f_{sq+} is bijective, and hence has an inverse.

Our counterexample showing f_{sq} is not injective was pointing out that f_{sq}(-2)=f_{sq}(2)=4, but with the domain restricted to just non-negative reals f_{sq+}(x)=4 has only one solution, and we see that f_{sq+}^{-1}(4)=2.

Important

A lot of students have a mistaken idea about the square root function and notation. In particular, the value of a square root of a non-negative real is a single value, and it is always positive. In other words, \sqrt{4}=2, and it is not -2 or \pm 2 or anything else that indicates negative or multiple values. More precisely, the square root of a non-negative real is the inverse of the f_{sq+} function as discussed in the previous example.

Some of this confusion comes from learning how to solve for x in equations such as x^2=4. Some students mistakenly think you can just “take the square root of both sides,” to get x=\sqrt{4}, but that’s not the correct square root of the left hand side (see the test your understanding below for more). So what is the solution? There are two solutions, and we can write x=\pm 2 to indicate that plus or minus 2 are solutions. However, this is not the square root, and it is not correct to say that \sqrt{4}=\pm 2.

To generalize a little, consider solving for x in x^2=y. The solutions are x=\pm \sqrt{y}, and you need the \pm outside the square root since the square root gives only the positive value.

Test your understanding: What is \sqrt{x^2}?

The correct answer is \sqrt{x^2}=|x|. Note that it is not just x. The square root function doesn’t just “undo” the squaring of x – in fact, if x\in\mathbb{R} then it is impossible to “undo” the squaring of x since f_{sq} is not injective over \mathbb{R}, and you lose information about x when it is squared! Going back to our example above, with x^2=4, taking the square root of both sides you would actually get |x|=2, which shows more clearly why x=\pm 2.

Theorem 21
If f:X\rightarrow Y is a bijective function with inverse f^{-1}:Y\rightarrow X, then (f^{-1})^{-1}=f. In other words, the inverse of the inverse of f is just f.

Example 22
We demonstrate the previous theorem with a very simple example, using f:\mathbb{Z}\rightarrow\mathbb{Z} with f(x)=x+1. In other words, f just makes the value x one larger (this is sometimes called the “successor function”). The inverse of this just makes x one smaller, so f^{-1}(x)=x-1. One way to to look at this is that f takes one step to the right on the number line, and f^{-1} “undoes” that by taking one step to the left.

What is the inverse of f^{-1}? With our “number line” interpretation, the inverse of f^{-1} (taking a step left) is taking a step right. In other words, (f^{-1})^{-1}=x+1, which is just f.

Finally, we consider taking the inverse of the composition of two functions. If f:X\rightarrow Y and g:Y\rightarrow Z, then the composition is g\circ f:X\rightarrow Z. If we want to reverse this mapping we would be mapping from Z to X. The only mapping from Z is g^{-1}, so g^{-1} must be applied first, whereas in g\circ f we apply f first. Taking the next step, it means that when we want the inverse of g\circ f, we must reverse the order in which the functions are applied. This is formalized in the following theorem.

Theorem 23
If f:X\rightarrow Y and g:Y\rightarrow Z are bijective functions, then (g\circ f) is bijective, with (g\circ f)^{-1}=f^{-1}\circ g^{-1}.

Programming Functions

We close our discussion of functions with a brief discussion of the difference between mathematical functions (what we’re interested in here), and programming functions as you might see them in Python, Java, or another programming language. Some of you reading these tutorials might have significant programming background, and think functions are those pieces of code you write that take parameters and return a value. While these are similar concepts, which is why they share a name, there are at least two important differences.

  • Mathematical functions are fixed mappings. In other words, if you look at the value of f(x) it will always be the same whenever x is the same. This is not the case with programming language functions, where the function computation may depend on external state (like global variables or values from a database) or might even be designed to return different values every time (like the pseudorandom number generation “function” rand()). Functional programming languages, like Haskell or ML, are designed to minimize this “state-sensitive” function evaluation, but it’s impossible to avoid it entirely in programming.

  • Mathematical functions do not need to be computable. Since a programming language function is defined by code that can be executed, as long as it works and gives a value for all input parameters, then it is by definition a computable function. While we explore this concept in more detail in other tutorials, it is an important fact that there are functions that are not computable, so can’t be implemented as a programming language function. If you don’t understand the following, don’t stress about it, but the vast majority of functions are not computable. In other words, if you had a way of picking a random function over the integers, the probability that it would be computable would be zero. This follows from the fact that the set of programs is countably infinite, and the set of functions is uncountably infinite. We even know specific functions that are uncomputable, such as a famous problem known as the “halting problem.” These results about computability are among the most fundamental and important results in all of computer science, and we will look at this in more depth in other tutorials.

Practice Problems

  1. Let X=\{1,2,3,4\} and Y=\{5,6,7,8\}. For each relation between X and Y below, state whether it is a function. If not, explain why not.

    1. Relation R_1:

      X Y
      1 6
      4 8
      2 7
      3 5
      2 6
    2. Relation R_2:

      X Y
      1 6
      4 8
      2 7
      3 5
    3. Relation R_3:

      X Y
      1 6
      4 8
      2 7
    4. Relation R_4:

      X Y
      1 5
      4 5
      2 5
      3 5
    1. R_1 is not a function, since two values are related to 2 (a function needs a unique mapping for each element of the domain).

    2. R_2 is a function: each element of the domain maps to exactly one element of the domain. This is in fact a bijective function, although the question didn’t ask about that.

    3. R_3 is not a function, since 3\in X has no value in the codomain related to it (i.e., 3 does not map to a value in Y).

    4. R_4 is a function: each element of the domain maps to exactly one element of the codomain. There’s no requirement that elements of the domain map to different values (which would make it an injective function), and the “constant function” shown here is a perfectly valid function.

  2. Is f(x)=\frac{x+2}{x-1} a function over the reals? Justify your answer.

    No, this is not a function since it is not defined on one value in the domain, specifically x=1. It is a partial function.

  3. Let X be the set of all dates (month, day, and year), and Y be the set of people.

    1. Define R_1\subseteq X\times Y as the relation between dates and people where xRy if x\in X was born on day y\in Y. Is this a function? Justify your answer.

    2. Define R_2\subseteq Y\times X as the relation between people and dates where yRx if x\in X was born on day y\in Y. Is this a function? Justify your answer.

    1. This is not a function, since multiple people (or no people!) can be born on a day; therefore, there is not a unique person for each day to map to.

    2. This is a function, with the real-world context that even though birth is not instantaneous, people are always assigned a single day as the date of their birth.

  4. For this problem, let X=\{1,2,3,4\}, Y=\{5,6,7,8\}, and Z=\{9,10,11,12,13\}. For each function below, determine if it is injective, surjective, and/or bijective. Justify your answers.

    1. Function f:X\rightarrow Z with f(1)=10, f(2)=11, f(3)=12, f(4)=13.

    2. Function g:X\rightarrow Y with g(1)=6, g(2)=7, g(3)=8, g(4)=5.

    3. Function h:X\rightarrow Y with h(1)=4, h(2)=4, h(3)=4, h(4)=4.

    1. f is injective, since every “right hand side” is different, but it is not surjective since no value maps to 9.

    2. g is both injective (since every value on the right hand side is different) and surjective (since every value in Y is mapped to). This means that g is bijective.

    3. h is neither injective (since multiple domain values map to 4) nor surjective (since no domain value maps to 5).

  5. Define function f:\mathbb{R}^2\rightarrow\mathbb{R}^2 by f((x,y))=(\min(x,y),\max(x,y)). Is f injective? Is it surjective? Justify your answer.

    This function is neither injective nor surjective.

    • Not injective: The function maps both of the pairs (1,2) and (2,1) to (1,2), so the function is not injective. Note that justification here could use any two points that map to the same output pair, but whenever possible you should give specific points as a counterexample!

    • Not surjective: The function always maps to a pair (a,b) with a\leq b, since a is the minimum of the input values and b is the maximum. Therefore, it is impossible for any input to map to a pair like (2,1) where the first element of the pair is larger. Since this (2,1)\in\mathbb{R}^2 and can’t be the image of any value in the domain, the function is not surjective.

    Additional note: One way to look at why this function is not injective is that it destroys some information. The argument (x,y) is an ordered pair, so contains information about the values and an ordering. Since the ordering of the output pair is fixed (non-decreasing), the only information in the pair is the values – ordering is no longer meaningful, so there’s less information in the function output than in the input. This is not a precise statement (although it can be made precise), but hopefully the intuition of “destroying information” helps.

  6. Define function f:\mathbb{R}^2\rightarrow\mathbb{R}^2 by f((x,y))=(x+y,x-y). Is f injective? Is it surjective? Justify your answer.

    This function is both injective and surjective. To justify this, we’ll give the inverse function, meaning that it is invertible (bijective) and hence both injective and surjective.

    Consider g((a,b))=(\frac{a+b}{2},\frac{a-b}{2}). This is defined for all (a,b)\in\mathbb{R}^2 and we see that

    \begin{array}{ll} g(f((x,y))) & = g((x+y,x-y)) \\ & = \rule{0pt}{2em} \displaystyle \left(\frac{(x+y)+(x-y)}{2},\frac{(x+y)-(x-y)}{2}\right) \\ & = \rule{0pt}{2em} \displaystyle\left(\frac{2x}{2},\frac{2y}{2}\right)\\ & = \rule{0pt}{1.5em} (x,y) . \end{array}

    Therefore, g is the inverse of f, meaning that f is invertible and a bijection. It is therefore both injective and surjective.

  7. Let S=\{a,b\} be the set of two characters a and b, and S^* be the set of all strings using these two characters (really just sequences of elements from S). Examples of strings in S^* include “a”, “aba”, and “aaab”. Order matters with strings, and we include the empty string in S^* as well.

    For the following definitions, give the domain and range and say (with justification) whether they are injective.

    1. The length function (giving the number of characters in a string)
    2. The string reversal function (so, for example, “aabab” would be mapped to “babaa”)
    3. A function which maps a string s to s followed by its reversal (so, for example, “aabab” would map to “aababbabaa”)
    4. The palindrome test function (indicate whether a string is the same as its reverse)
    5. A function that repeats each character in a string twice (so, for example, “aabab” would map to “aaaabbaabb”)
    1. The domain is S^* and the range is \mathbb{N}. Note that the codomain can be larger, so the codomain could be \mathbb{Z}, but since there’s no way to have a negative length the range must be \mathbb{N}. Note that it must also include zero (for the empty string), so \mathbb{Z}^+ would not be correct. This function is not injective (strings “a” and “b” both map to 1, for instance).

    2. The domain and range are both S^*, and this function is injective. In fact, the function is its own inverse, since reversing a string twice gives the original string.

    3. The domain is S^*, and the range is the set of even-length palindromes (strings that are the same forwards and backwards). This function is injective, since if s is an output of the function then the first half of s is the one and only string that maps to this output.

    4. The domain is S^* and the range is \{\text{true},\text{false}\}. This is obviously not injective, as many strings are palindromes (and many are not).

    5. The domain is S^*. The range is a little complicated, but it’s the set of even length strings in which characters are repeated once. This function is injective (just take every other character as the inverse). If the function were defined with S^* as the codomain, it would still be injective, but it would not be surjective so wouldn’t be invertible.

  8. For the following pairs of functions over the reals, give f_1+f_2 and f_1f_2.

    1. f_1(x)=1
      f_2(x)=x

    2. f_1(x)=x+1
      f_2(x)=x-1

    3. f_1(x)=x^2+2x+1
      f_2(x)=x+3

    A subtle but important point for these answers: You should not write this as f_1(x)+f_2(x), since that’s adding together two function evaluations,, not defining a new function. The notation should be as shown below!

    1. (f_1+f_2)(x)=x+1
      (f_1f_2)(x)=x

    2. (f_1+f_2)(x)=2x
      (f_1f_2)(x)=x^2-1

    3. (f_1+f_2)(x)=x^2+3x+4
      (f_1f_2)(x)=(x^2+2x+1)(x+3)=x^3+5x^2+7x+3

  9. For the following pairs of functions over the reals, give f_2\circ f_1 and f_1\circ f_2.

    1. f_1(x)=1
      f_2(x)=x

    2. f_1(x)=x+1
      f_2(x)=x-1

    3. f_1(x)=x^2+2x+1
      f_2(x)=x+3

    1. (f_2\circ f_1)(x)=1
      (f_1\circ f_2)(x)=1

    2. (f_2\circ f_1)(x)=(x+1)-1=x
      (f_1\circ f_2)(x)=(x-1)+1=x

    3. (f_2\circ f_1)(x)=(x^2+2x+1)+3=x^2+2x+4
      \begin{array}{ll}(f_1\circ f_2)(x)&=(x+3)^2+2(x+3)+1=x^2+6x+9+2x+6+1\\&=x^2+8x+16\end{array}

      When checking your answers for part c, make sure you don’t have these switched. Order is important!

  10. Let f_1(x) is a polynomial over the reals with degree \text{deg}(f_1(x))=d_1, and f_2(x) be a polynomial over the reals with degree \text{deg}(f_2(x))=d_2.

    1. What is the maximum possible degree of f_1 +f_2?

    2. What is the degree of f_1f_2?

    3. What is the degree of f_2\circ f_1?

    1. \text{deg}(f_1+f_2)\leq\max(\text{deg}(f_1),\text{deg}(f_2))

      This follows from properties of the sum of polynomials. Note that this problem asked for the “maximum possible degree” since it is possible for the leading terms in f_1 and f_2 to cancel out, reducing the degree of the result (for example, if f_1(x)=-f_2(x), then the entire polynomials cancel out regardless of the degree of f_1(x).

    2. \text{deg}(f_1f_2)=\text{deg}(f_1)+\text{deg}(f_2)

      This is again a property of the product of polynomials. If the leading term of f_1(x) is a_nx^n and the leading term of f_2(x) is b_mx^m, then the leading term of the product (f_1f_2)(x) is a_nb_mx^{n+m}.

    3. \text{deg}(f_2\circ f_1)=\text{deg}(f_1)\cdot\text{deg}(f_2)

      This one is a little trickier, and maybe something you haven’t seen directly in previous math classes. If the leading term of f_2(x) is b_mx^m, then composing the functions involves computing b_m(f_2(x))^m. If the leading term of f_2(x) is a_nx^n, then the leading term of the mth power of f_2(x) is (a_n)^m(x^{n})^{m}=(a_n)^mx^{nm}. Therefore, the leading term in the composed function is b_m(a_n)^mx^{nm}, making the composed function degree nm.

  11. Give that f_1 and f_2 are polynomials over the reals, with f_1(x)=x^2 and (f_1\circ f_2)(x)=x^2+2x+1, what is f_2(x)?

    f_2(x)=x+1.

    To derive this, consider that (f_1\circ f_2)(x)=(f_2(x))^2, and the problems gives that (f_1\circ f_2)(x)=x^2+2x+1. Factoring this last polynomial gives (f_1\circ f_2)(x)=(x+1)^2, so f_2(x)=x+1.

  12. Let f_1(x) and f_2(x) be linear functions over the reals, with f_1(x)=a_1x+b_1 and f_2(x)=a_2x+b_2. Is f_1\circ f_2=f_2\circ f_2 always, sometimes, or never? If “sometimes” then under what condition is this true?

    We first compute the composed functions:

    (f_1\circ f_2)(x)=a_1(a_2x+b_2)+b_1=a_1a_2x+a_1b_2+b_1

    (f_2\circ f_1)(x)=a_2(a_1x+b_1)+b_2=a_1a_2x+a_2b_1+b_2

    Since the coefficient on the x term is the same in both cases, these functions are equal exactly when

    a_1b_2+b_1=a_2b_1+b_2 \ . \tag{2}

    Exploring this condition a little shows that there are many ways to satisfy it – for example, when a_1=a_2=1, then the condition is met for any b_1 and b_2 values. Furthermore, when neither a_1 nor a_2 are one, given a_1, a_2, and b_1 we can always find a b_2 that makes the condition true. For example, with a_1=3, a_2=5, and b_1=1, we can solve for b_2 to get b_2=2. However, no other b_2 satisfy the condition for these a_1, a_2, and b_1 values, so the bottom line is that the answer to the question is “sometimes,” and the condition is given by Equation 2.

  13. The tutorial mentioned that I_S:S\rightarrow S, the identity function on set S, is bijective. That means that I_S is invertible – what is the inverse of I_S?

    The inverse of I_S is just I_S – in other words, it is its own inverse. We can see this by noticing that since I_S(x)=x for all x\in S, we also have I_S(I_S(x))=x for all x\in S.