前言

原本是要让 AI 来玩狼人杀的,后来感觉思路不是很清晰,所以干脆今天就先做个单人游戏——扫雷,先给大家看一下成果,完整代码见 https://github.com/zong4/AILearning。

游戏规则

这里先给不了解游戏规则的人介绍一下游戏,扫雷中的每个数字代表的是,以此数字为中心的九宫格内有几颗地雷,如下图就表示 A~H 这八个格子里有一颗雷。

扫雷逻辑

OK,那知道了游戏规则后,我们怎么来表示这个 Knowledge Base 呢?当然你可以像昨天说的那样用以下的式子来表示。

1
Or(A, B, C, D, E, F, G, H)

但是,别忘了,上面的式子只能表达有至少一颗雷,但是没有表达出最多也只有一颗雷,具体还需要下面这一堆才行。

1
2
3
4
5
6
7
8
9
10
Or(
And(A, Not(B), Not(C), Not(D), Not(E), Not(F), Not(G), Not(H)),
And(Not(A), B, Not(C), Not(D), Not(E), Not(F), Not(G), Not(H)),
And(Not(A), Not(B), C, Not(D), Not(E), Not(F), Not(G), Not(H)),
And(Not(A), Not(B), Not(C), D, Not(E), Not(F), Not(G), Not(H)),
And(Not(A), Not(B), Not(C), Not(D), E, Not(F), Not(G), Not(H)),
And(Not(A), Not(B), Not(C), Not(D), Not(E), F, Not(G), Not(H)),
And(Not(A), Not(B), Not(C), Not(D), Not(E), Not(F), G, Not(H)),
And(Not(A), Not(B), Not(C), Not(D), Not(E), Not(F), Not(G), H),
)

光是一个 1 就这么复杂,那怎么行,所以我们得想一个新的表达方式。

知识表示

思考一下,扫雷其实说白了就三种操作。

  1. 如果数字正好等于未翻格子的数量,则代表那些全是雷。
  2. 如果数字为0,则代表周围没有雷。
  3. 切割子集。

第一条和第二条都很好理解,想象一下中间数字是8和0的情况就行了。

至于第三条,我们可以通过经典的12排雷给大家解释。

如下图,通过上面的1我们可以知道,ABC 中一定有一雷,又通过下面的2我们知道 ABCDE 中一定有两颗雷,所以我们可以得出 DE 中一定有一颗雷,这就是切割子集。

OK,那我们该用什么样的数据结构来表示呢?

相信聪明的朋友已经发现了,我在上面一直在说几何集合,没错,我们可以用集合来表示,具体表示形式如下。

在上面的例子中,我们可以说 {A, B, C} = 1,{A, B, C, D, E} = 2,然后得出 {D, E} = 1,也就是说我们有如下公式。

set1 = number1,set2 = number2,set2-set1 = number2-number1。

代码

运行代码

运行代码就给大家稍微带一下。

一块的话是画图的代码,每帧更新,没什么好说的。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
while True:
...

# Draw board
cells = []
for i in range(HEIGHT):
row = []
for j in range(WIDTH):

# Draw rectangle for cell
rect = pygame.Rect(
board_origin[0] + j * cell_size,
board_origin[1] + i * cell_size,
cell_size, cell_size
)
pygame.draw.rect(screen, GRAY, rect)
pygame.draw.rect(screen, WHITE, rect, 3)

# Add a mine, flag, or number if needed
if game.is_mine((i, j)) and lost:
screen.blit(mine, rect)
elif (i, j) in flags:
screen.blit(flag, rect)
elif (i, j) in revealed:
neighbors = smallFont.render(
str(game.nearby_mines((i, j))),
True, BLACK
)
neighborsTextRect = neighbors.get_rect()
neighborsTextRect.center = rect.center
screen.blit(neighbors, neighborsTextRect)

row.append(rect)
cells.append(row)

...

还有一块的话是 AI 的扫雷步骤,可以看到是先查找有没有安全的格子,没有的话就随机选。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
elif left == 1:
mouse = pygame.mouse.get_pos()

# If AI button clicked, make an AI move
if aiButton.collidepoint(mouse) and not lost:
move = ai.make_safe_move()
if move is None:
move = ai.make_random_move()
if move is None:
flags = ai.mines.copy()
print("No moves left to make.")
else:
print("No known safe moves, AI making random move.")
else:
print("AI making safe move.")
time.sleep(0.2)

# use flags to mark mines
for i in range(HEIGHT):
for j in range(WIDTH):
if (i, j) in ai.mines:
flags.add((i, j))

逻辑代码

OK,接下来看看大头逻辑代码。

首先是棋盘类,主要的话就是要判断格子上是不是雷已经它周围雷的数量,也就是要显示的数字。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
class Minesweeper():
"""
Minesweeper game representation
"""

def __init__(self, height=8, width=8, mines=8):

# Set initial width, height, and number of mines
self.height = height
self.width = width
self.mines = set()

# Initialize an empty field with no mines
self.board = []
for i in range(self.height):
row = []
for j in range(self.width):
row.append(False)
self.board.append(row)

# Add mines randomly
while len(self.mines) != mines:
i = random.randrange(height)
j = random.randrange(width)
if not self.board[i][j]:
self.mines.add((i, j))
self.board[i][j] = True

# At first, player has found no mines
self.mines_found = set()

def is_mine(self, cell):
i, j = cell
return self.board[i][j]

def nearby_mines(self, cell):
"""
Returns the number of mines that are
within one row and column of a given cell,
not including the cell itself.
"""

# Keep count of nearby mines
count = 0

# Loop over all cells within one row and column
for i in range(cell[0] - 1, cell[0] + 2):
for j in range(cell[1] - 1, cell[1] + 2):

# Ignore the cell itself
if (i, j) == cell:
continue

# Update count if cell in bounds and is mine
if 0 <= i < self.height and 0 <= j < self.width:
if self.board[i][j]:
count += 1

return count

def won(self):
"""
Checks if all mines have been flagged.
"""
return self.mines_found == self.mines

而 Sentence 类的话其实就是对应了一个集合,所以它需要实现知识表示里的前两种操作。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
class Sentence():
"""
Logical statement about a Minesweeper game
A sentence consists of a set of board cells,
and a count of the number of those cells which are mines.
"""

...

def known_mines(self):
"""
Returns the set of all cells in self.cells known to be mines.
"""
if len(self.cells) == self.count:
return self.cells
return set()

def known_safes(self):
"""
Returns the set of all cells in self.cells known to be safe.
"""
if self.count == 0:
return self.cells
return set()

def mark_mine(self, cell):
"""
Updates internal knowledge representation given the fact that
a cell is known to be a mine.
"""
if cell in self.cells:
self.cells.remove(cell)
self.count -= 1

def mark_safe(self, cell):
"""
Updates internal knowledge representation given the fact that
a cell is known to be safe.
"""
if cell in self.cells:
self.cells.remove(cell)

最后就是我们的 AI Agent 了,主要就是去记录一些安全格和雷格的信息,从而进行安全的移动。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
class MinesweeperAI():
"""
Minesweeper game player
"""

def __init__(self, height=8, width=8):

# Set initial height and width
self.height = height
self.width = width

# Keep track of which cells have been clicked on
self.moves_made = set()

# Keep track of cells known to be safe or mines
self.mines = set()
self.safes = set()

# List of sentences about the game known to be true
self.knowledge = []

def mark_mine(self, cell):
"""
Marks a cell as a mine, and updates all knowledge
to mark that cell as a mine as well.
"""
self.mines.add(cell)
for sentence in self.knowledge:
sentence.mark_mine(cell)

def mark_safe(self, cell):
"""
Marks a cell as safe, and updates all knowledge
to mark that cell as safe as well.
"""
self.safes.add(cell)
for sentence in self.knowledge:
sentence.mark_safe(cell)

...

def make_safe_move(self):
"""
Returns a safe cell to choose on the Minesweeper board.
The move must be known to be safe, and not already a move
that has been made.

This function may use the knowledge in self.mines, self.safes
and self.moves_made, but should not modify any of those values.
"""
for safe_cell in self.safes:
if safe_cell not in self.moves_made:
return safe_cell
return None

def make_random_move(self):
"""
Returns a move to make on the Minesweeper board.
Should choose randomly among cells that:
1) have not already been chosen, and
2) are not known to be mines
"""
possible_moves = set()
for i in range(self.height):
for j in range(self.width):
if (i, j) not in self.moves_made and (i, j) not in self.mines:
possible_moves.add((i, j))
if possible_moves:
return random.choice(list(possible_moves))
return None

而其中负责增加新信息的主要这个 add_knowledge(self, cell, count) 函数,这里的 cell 就是当前新格子周围的未翻开格子了,count 就是周围雷的数量,那这个函数具体干了哪些事呢?

  1. 记录并标记当前格子未安全。
  2. 添加新信息,也就是 cell = count,同时根据已知的安全格缩小 cell。
  3. 根据新的安全格,更新 KB。
  4. 通过新集合,划分子集。
  5. 再次更新 KB。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
def add_knowledge(self, cell, count):
"""
Called when the Minesweeper board tells us, for a given
safe cell, how many neighboring cells have mines in them.
"""

self.moves_made.add(cell)
self.mark_safe(cell)

# Add new sentence to knowledge base
new_sentence_cells = set()
for i in range(cell[0] - 1, cell[0] + 2):
for j in range(cell[1] - 1, cell[1] + 2):
if 0 <= i < self.height and 0 <= j < self.width:
if (i, j) in self.mines:
count -= 1
elif (i, j) not in self.safes:
new_sentence_cells.add((i, j))
new_sentence = Sentence(new_sentence_cells, count)
self.knowledge.append(new_sentence)

# Mark additional cells as safe or as mines
for sentence in self.knowledge:
for safe_cell in sentence.known_safes():
self.safes.add(safe_cell)
for mine_cell in sentence.known_mines():
self.mines.add(mine_cell)

# Add new sentences to knowledge base
for sentence1, sentence2 in itertools.combinations(self.knowledge, 2):
if sentence1.cells.issubset(sentence2.cells):
new_sentence = Sentence(sentence2.cells - sentence1.cells, sentence2.count - sentence1.count)
elif sentence2.cells.issubset(sentence1.cells):
new_sentence = Sentence(sentence1.cells - sentence2.cells, sentence1.count - sentence2.count)
if new_sentence not in self.knowledge:
self.knowledge.append(new_sentence)

# Remove empty sentences
self.knowledge = [sentence for sentence in self.knowledge if sentence.cells]

# Mark additional cells as safe or as mines
for sentence in self.knowledge:
for safe_cell in sentence.known_safes():
self.safes.add(safe_cell)
for mine_cell in sentence.known_mines():
self.mines.add(mine_cell)

# Mark safe cells and mines
for safe_cell in self.safes:
self.mark_safe(safe_cell)
for mine_cell in self.mines:
self.mark_mine(mine_cell)

后记

其实读到这里大家应该发现了,这扫雷写起来也不难,难的是你有没有想到用这样的数据逻辑来解释他。