PrevNext

Introduction

Resources
IUSACO

module is based off this

CPH

covers similar material

A set is a collection of unique elements. Sets have three primary methods:

  • one to add an element
  • one to remove an element
  • one to check whether an element is present

A map is a collection of entries, each consisting of a key and a value. In a map, all keys are required to be unique (i.e., they will form a set), but values can be repeated. Maps have three primary methods:

  • one to add a specified key-value pairing
  • one to remove a key-value pairing
  • one to retrieve the value for a given key

C++ and Java both have two implementations of sets and maps; one uses sorting while the other uses hashing. Python's implementation of sets and maps uses hashing.

Sets

Focus Problem – try your best to solve this problem before continuing!

View Internal Solution

Sorted Sets

Sorted sets store elements in sorted order. All primary methods (adding, removing, and checking) run in O(logN)\mathcal{O}(\log N) worst-case time, where NN is the number of elements in the set.

Warning!

Ignore this section, as Python doesn't implement sorted sets.

Hashsets

Hashsets store elements using hashing. Roughly, a hashset consists of some number of buckets BB, and each element is mapped to a bucket via a hash function. If BNB\approx N and the hash function independently maps each distinct element to a uniformly random bucket, then no bucket is expected to contain many elements, and all primary methods will all run in O(1)\mathcal O(1) expected time.

Warning!

In the worst case, hashsets in Python may take proportional to NN time per operation. This will be demonstrated later in the module.

Python's built-in set uses hashing to support O(1)\mathcal{O}(1) insertion, deletion, and searches. Some operations on a Python set named s include:

  • s.add(x): adds element x to s if not already present
  • s.remove(x): removes an element x from set if present
  • x in s: checks whether s contains the element x
s = set()
s.add(1) # {1}
s.add(4) # {1, 4}
s.add(2) # {1, 4, 2}
s.add(1) # {1, 4, 2}
# the add method did nothing because 1 was already in the set
print(1 in s) # True
s.remove(1) # {4, 2}
print(5 in s) # False
s.remove(0) # {4, 2}
# if the element to be removed does not exist, nothing happens

Solution - Distinct Numbers

This problem asks us to calculate the number of distinct values in a given list.

Method 1 - Sorted Set

Because sets only store one copy of each value, we can insert all the numbers into a set, and then print out the size of the set.

Warning!

Ignore this section, as Python doesn't implement sorted sets.

Method 2 - Hashset

n = int(input()) # unused
nums = [int(x) for x in input().split()]
distinct_nums = set(nums)
print(len(distinct_nums))

Or we can do this more concisely by skipping the creation of the list, and use a set comprehension directly:

n = int(input()) # unused
distinct_nums = {int(x) for x in input().split()}
print(len(distinct_nums))

However, it is possible to construct a test case that causes the above solution to run in Θ(N2)\Theta(N^2) time, so this solution does not receive full credit.

Hack Case Generator

One way to get around this with high probability is to incorporate randomness; see this comment for more information.

import random
RANDOM = random.randrange(2**62)
def Wrapper(x):
return x ^ RANDOM
n = int(input()) # unused
distinct_nums = {Wrapper(int(x)) for x in input().split()}
print(len(distinct_nums))

Another (less efficient) way is to use strings instead of integers since the hash function for strings is randomized.

n = int(input()) # unused
distinct_nums = set(input().split())
print(len(distinct_nums))

Should I worry about anti-hash tests in USACO?

No - historically, no USACO problem has included an anti-hash test. However, these sorts of tests often appear in Codeforces, especially in educational rounds, where open hacking is allowed.

Method 3 - Sorting

Check out the solution involving sorting.

Maps

Focus Problem – try your best to solve this problem before continuing!

In sorted maps, the pairs are sorted in order of key. As with sorted sets, all primary methods run in O(logN)\mathcal{O}(\log N) worst-case time, where NN is the number of pairs in the map.

In hashmaps, the pairs are hashed to buckets by the key, and as with hashsets, all primary methods run in O(1)\mathcal O(1) expected time under some assumptions about the hash function.

Colloquially, hashmaps are referred to as dicts in python.

d = {}
d[1] = 5 # {1: 5}
d[3] = 14 # {1: 5, 3: 14}
d[2] = 7 # {1: 5, 2: 7, 3: 14}
del d[2] # {1: 5, 3: 14}
print(d[1]) # 5
print(7 in d) # False
print(1 in d) # True

Iterating Over Maps

To iterate over dicts, there are three options, all of which involve for loops. Dicts will be returned in the same order of insertion in Python 3.6+. You can iterate over the keys:

for key in d:
print(key)

Over the values:

for value in d.values():
print(value)

And even over key-value pairs:

for key, value in d.items():
print(key, value)

It's also possible to change the values while iterating over the keys (or over the values themselves, if they're mutable):

for key in d:
d[key] = 1234 # Change all values to 1234

While you are free to change the values in a map when iterating over it (as demonstrated above), it is generally a bad idea to insert or remove elements of a map while iterating over it.

Modifying a Collection (Set, Map, etc.) in the middle of a for-each loop will cause a ConcurrentModificationException. See the following snippet for an example:

Map<Integer, Integer> m = new TreeMap<>();
// m starts as {0: 0, 1: 1, 2: 2}
m.put(0, 0);
m.put(1, 1);
m.put(2, 2);
for (int key : m.keySet()) {
m.remove(key); // ConcurrentModificationException thrown!!
}

One work-around is to use Iterator and the .remove() method to remove elements while looping over them, like in the next code snippet:

Map<Integer, Integer> m = new TreeMap<>();
// m starts as {0: 0, 1: 1, 2: 2}
m.put(0, 0);
m.put(1, 1);
m.put(2, 2);
Iterator<Map.Entry<Integer, Integer>> iter = m.entrySet().iterator();
while (iter.hasNext()) {
int key = iter.next().getKey();
if (key == 0 || key == 2) { iter.remove(); }

However, Iterator is outside the scope of this module.

The easiest option (in most cases) if you want to remove/insert multiple entries at once is to use your Container's .addAll(c) or .removeAll(c) methods. That means that you should put all the elements you want to remove (or add) in a new Collection, and then use that new Collection as the parameter of the .addAll(c) or .removeAll(c) method that you call on your original Collection. See the following code snippet for an example (it works equivalently to the code above):

Map<Integer, Integer> m = new TreeMap<>();
// m starts as {0: 0, 1: 1, 2: 2}
m.put(0, 0);
m.put(1, 1);
m.put(2, 2);
Set<Integer> keysToRemove = new TreeSet<>();
for (Map.Entry<Integer, Integer> entry : m.entrySet()) {
int key = entry.getKey();
if (key == 0 || key == 2) { keysToRemove.add(key); }

Solution - Associative Array

To solve this problem efficiently, we need a data structure that can:

  • Assign a value to any index k (where k can be as large as 101810^{18}).
  • Retrieve the value at any index k quickly.

A regular array won't work because the indices can be extremely large, making it impossible to allocate enough memory. However, since all values are initially 0 and only a small subset of indices will ever be set or queried, we can use a map (also called an associative array or dictionary) to store only the indices that have been assigned a value.

  • When we receive a 0 k v query, we set a[k] = v in the map.
  • When we receive a 1 k query, we print a[k] if it exists in the map, or 0 otherwise.

This approach is efficient because both operations are fast in maps, and we only store the keys that are actually used.

Note that we use 64-bit integers since kk and vv may be large.

Unfortunately, the straightforward solution fails a few tests specifically designed to make Python dicts run slowly:

a = dict() # Dictionary to store assigned indices
for _ in range(int(input())):
nums = list(map(int, input().split()))
if nums[0] == 0:
a[nums[1]] = nums[2]
elif nums[0] == 1:
# Print a[k] if present, else 0
print(a.get(nums[1], 0))

To pass all the tests, we can use one of the workarounds mentioned for set.

import random
RANDOM = random.randrange(2**62)
def wrap(x):
return x ^ RANDOM
a = dict() # Dictionary to store assigned indices

Problems

Some of these problems can be solved by sorting alone, though sets or maps could make their implementation easier.

StatusSourceProblem NameDifficultyTags
CSESEasy
Show TagsMap
BronzeEasy
Show TagsSet
BronzeNormal
Show TagsSet, Simulation
BronzeNormal
Show TagsMap
BronzeNormal
Show TagsMap, Sorting
BronzeNormal
Show TagsMap, Set
SilverNormal
Show TagsMap
CFNormal
Show TagsPrefix Sums, Set
BronzeHard
Show TagsMaps, Set
ACHard
Show TagsMap
CFHard
Show TagsMap, Set

Quiz

What is the worst-case time complexity of insertions, deletions, and searches in a set of size NN?

Question 1 of 7

Module Progress:

Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!

PrevNext