For JOHN OLLE

From: loeb@geocub.UUCP (Daniel LOEB)

Date: Mon, 24 Jun 1991 09:44:04 +0000

I'm having trouble sending eMail to John Olle (olle@sheoak.bcae.oz).
Could you pass this along if you know him.

Yes, indeed there are people who work on Diplomacy on the computer.

The simplest question is computer automated adjudication. This has
been solved in a very satisfying manner by Ken Lowe
(jdr@u.washington.edu). For info about his program, send the message
        HELP
to judge@u.washington.edu.

The hard question is getting the computer to play the game. That is
the goal of the Diplomacy programming project.

Diplomacy Programming Project
table.of.contents
Updated: 25 April 1991

Thanks for your interest in the Diplomacy Programming Project. The
best way I can tell you more about it, is to send you a sample of some
of the files relating to the project.... (The files I didn't send, you
can request directly from me).

name            topic
        article.tex     A better version of "hall.article" (in progress)
        conversations   An appendix to "protocol" giving examples of syntax
        course          How to get credit for your participation
        dipl.sml        The Diplomat Interface progress written in LCS
                        Program is in many parts including:
                datatypes.sml   declarations
                dipl.bugs       list of bugs to fix
                dipl.comments   general comments about the program
                draw.sml        handling of the DRW command
                endofturn.sml   production and distribution of results
                general.sml     a package of general purpose functions
                init.sml        initialisation routines
                map             the standard diplomacy map
                map.sml         the parts of init.sml which depend on the map
                process.sml     the "front-end" of the program
                rules           AH rulebook with commentary regarding DPP
                strings.sml     string "beautification" package
                submit.sml      handling of the SUB command
        hall.article    Article by Michael Hall on programming a diplomat
        intro           First proposal for DPP
        poker           Poker Programming Project version of article.tex
        protocol        The language in which Diplomats communicate
        snake           Rules for a simple multiplayer game
        strategy        A simple diplomacy strategy
        table.of.con... This letter
        update.sep1,2,3 progress reports
        update.oct1,2,3 progress reports
        update.nov1,2,3 progress reports

Please tell me if you want me to put you on the mailing list for
future reports. I hope you decide to join us, and possibly write a
program for use with the project.

(See below for art.tex - latex format)

Yours, Daniel Loeb, DPP
"Combinatorics: The Thought that Counts"

EMAIL   loeb@geocub.greco-prog.fr (or else try loeb@frbdx11.bitnet)
HOME    150, cours Victor-Hugo; Appt B21, 33000 Bordeaux France
WORK    Departement des Maths et d'Info, Universite de Bordeaux I,
        351 cours de la Liberation, 33405 Talence Cedex France
PHONE(H)+(33) 56 31 48 26, (W)+(33) 56 84 60 88, (FAX) +(33) 56 80 08 37

\documentstyle[twoside,titlepage]{article}
\setlength{\textwidth}{6.7in}
\setlength{\evensidemargin}{0.0in}
\setlength{\oddsidemargin}{0.0in}
\setlength{\parindent}{1cm}
\newcommand{\seper}{\nopagebreak \begin{center}
        \underline{\hspace{2in}}
        \end{center}}
\newcommand{\MovesTo}{\mbox{$\rightarrow$}}
\newcommand{\fig}[2]{\begin{table}[htbp]
        \caption{#1}
        \begin{center}
                \fbox{\scriptsize #2}
        \end{center}
\end{table}}
\newcommand{\sw}{\swarrow}
\newcommand{\se}{\searrow}
\newcommand{\do}{|}
\setlength{\parskip}{7pt plus 5pt minus 3pt}
\newcommand{\tictactoe}[4]
{\mbox{\tiny \begin{tabular}{@{}c@{}|@{}c@{}|@{}c@{}}
                        \multicolumn{3}{c}{#1}\\
                        #2 \\ \hline
                        #3 \\ \hline
                        #4
                        \end{tabular}}}
\begin{document}
\title{Thoughts on Programming a Diplomat}
\author{Michael R. Hall\\
hall@rocky.bellcore.com \\
Bellcore\\
445 South St., Room 2D-335\\
Morristown, NJ 07960
\and
Daniel E. Loeb\\
loeb@nestor.greco-prog.fr\\     %***
LABRI\\
Universit\'{e} de Bordeaux I\\
33205 Talence, France}
\date{Copyright \today}
\maketitle

\begin{abstract}
In contrast to two-player
sequential move games, multiplayer games with simultaneous move have not
been extensively studied. Diplomacy \cite{Rules} is a multi-player
simultaneous move game that supplies some new challenges
in creating a computer player, namely: the difficulty of determining the
true strength of a position, the tremendous size of the search tree,
and the necessity of negotiation. The Diplomacy Programming Project
presents here some of its ideas for overcoming these problems.

{\bf Premi\`{e}re \'{e}tude sur la programmation d'un diplomate}

Contrairement aux jeux \`{a} deux joueurs dans lesquels chacun joue
\`{a} tour de r\^ole, on n'a jamais vraiment \'{e}tudi\'{e} les
jeux \`{a} plusieurs joueurs jouant chaque coup simultan\'{e}ment. Le
jeu de {\em Diplomacy} \cite{Rules} est un exemple type de cette derni\`{e}re
categorie de jeux. Il est particuli\`{e}rement interessant de
l'\'{e}tudier en vue de l'elaboration d'un ``diplomate informatique''
: il faut en effet pour cela d\'{e}terminer quelle est numeriquement
la veritable force d'une position, cela \`{a} l'interieur d'un arbre
de recherche gigantesque. Il faut d'autre part accorder un r\^ole
essentiel \`{a} la n\'{e}gotiation. Nous r\'{e}pondons ici, dans le
cadre du {\em Diplomacy Programming Project}, \`{a} un certain nombre
de ces questions.
\end{abstract}

%\tableofcontents

\section{Introduction}

The theory of two player games is beginning to become well understood.
On the theoretical front, we have such works as {\em Winning Ways}
\cite{WinningWays}.
Meanwhile, on a more practical level advances in game theory and artificial
intelligence have given birth to computer programs which play such
games as chess, checkers, and othello amazingly well.

In a multi-player game (and in particular the game diplomacy), the
results are nowhere as clear. There are three issues at stake here:
\begin{itemize}
\item First, the ``paradox'' that in a multiplayer
game, then best move is not always one that gives you the ``strongest''
position. As a example of this paradox, consider the following game
involving three players.
\begin{quote}
The three players A, B, and C are involved in a gunfight. They are to
take turns shooting one bullet until only one player remains. C is a
perfect shooter; he kills his target every time. B has an accuracy of two
kills in three attempts. Finally, A only hits his target one out of
every three times. To be fair, it is decided that A should begin the
gunfight, to be followed by B, and then C (if they are alive), and so
on.
\end{quote}
It is amazing but true that the best strategy for player A is to
purposely miss. If by chance, he hits one of his opponents, then he now
becomes the target for the remaining one. By keeping both opponents
alive, he guarantees himself a chance to win on the second round!

Similarly, in Diplomacy \cite{Rules}, the goal is to accumulate a
total of eighteen {\em supply centers}, yet it is a general maxim
that Germany should not attempt to
capture three supply centers during the first couple of turns. By
doing so, he would
threaten too many other countries. Germany is thus in a stronger
position if he is ``weaker'' by one unit. He thus grows at a decent
rate while not provoking any attacks.

As a result of this paradox, the evaluation of the strength of a
Diplomacy position is no longer as straight-forward as it is in a game
such as chess. In chess, white never avoids taking black's queen
for fear of black ``ganging up on him.'' However, in Diplomacy,
Germany often decides to avoid taking {\em  Belgium} for fear of the
consequent formation of a coalition against him.

\item The other issue involved in all multi-player games is that of
negotiation. It is this facet of the game which gives Diplomacy its
name. No player can hope to win a game of Diplomacy by remaining
silent. He only owns one-seventh of the force on the board. Even if
his strategy is superior to the other players, he could not win against
a force six times greater than his own.

The analogy in chess would be a grandmaster starting a game of chess
with only a knight and three pawns. The grandmaster's advantage in
ability would not be enough to overcome the corresponding lack of
material.
\item Finally, Diplomacy is distinguished among all multi-player games by
the fact that all players submit moves for each of their units
simultaneously. Since each of the seven players begins with at least three
units, and each unit has at least four possible legal moves, a quick
computation reveals that even a ``one-ply'' search is impossible.

On the other hand, in chess there is no position with over 203
possible next-moves \cite{Chess}. In fact, most common positions have
one or two dozen possible continuations. Thus, it is fairly
easy to execute a extensive search of reasonable depth.
\end{itemize}

The Diplomacy Programming Project
seeks to create automated diplomacy players, or {\em
diplomats}.\footnote{Short for ``Diplomacy
automata''} A program called the {\em Diplomacy Interface} has been created
to provide a common environment for diplomats, so that they may
communicate with each other and have their moves carried out.
We are currently designing actual diplomats. In so doing, we are
pressing simultaneously on three fronts of computer game-playing:
First, positions must be evaluated using new techniques; next,
search techniques must be investigated to narrow down the possible options;
finally, negotiation heuristics must be devised.

In some sense, the game Diplomacy provides a vehicle to study
the interactions among intelligent agents - be they human or computer.
Issues of negotiation, cooperative/antagonistic behavior, problem-solving,
learning, tactics, strategy and so on all arise in this rich environment,
in sharp contrast to one person or two person games with no mechanism
for communication outside the game board.  The {\em Diplomacy Interface}
can be used by humans as well as computer diplomats, and we thus have a
laboratory where theories of human and machine intelligence may be tested,
against each other if desired.  This means a limited form of the Turing test
is also possible.

Before delving into the problem of building diplomats, we
present a short review of the rules of the game Diplomacy.

\section{Rules}
\subsection{Basic Rules}

In 1959, Allan Calhammer created Diplomacy with an eye towards recreating
the political intrigue of World War I Europe. It is played on a
map of Europe (see Table \ref{fig1})
\fig{Diplomacy Map}
{\label{fig1} \rule{0cm}{7.5cm} \rule{11cm}{0cm}}
which has been divided into {\em regions} (called {\em
seas} and {\em provinces}).
The seven players represent the great
powers of World War I (Austria, England, France, Germany, Italy,
Russia, and Turkey). Each starts with three home cities known as {\em supply
centers},\footnote{Except Russia, which has four.} and an equal number
of {\em
units} (either {\em armies} or {\em fleets}). Certain {\em neutral}
supply centers
are unowned at the beginning of the game.\footnote{All supply centers
are marked by a $\bullet$ in Table \ref{fig1}.}

At the end of each {\em year} of play, the number of units is again
compared to the number of supply centers, and units are {\em built} or
{\em removed} in order to bring the two equal. Thus, by capturing
territory a player can become more powerful. A player wins by
capturing 18 supply centers which represents a little more than half
the board. The game can also end at any time with a draw by the mutual
consent of the remaining players.

Each year is divided into five {\em seasons}: {\em Spring} movement,
{\em Summer} retreat, {\em Fall} movement, {\em Autumn} retreat, and
{\em Winter} build. The first turn on the game is considered Spring
1901.

\subsection{Submitting Orders}

The heart of the game lies in the movement
seasons. Each movement season is preceded by fifteen minutes of
negotiations among the players during which the players devise
their strategies. Each player then writes orders secretly for all of
his units. Each unit can do one of the following actions:
\begin{description}
\item [Move] to an adjacent province with the restriction that armies
can not enter seas, and that fleets can not move inland and must
follow the coasts when moving between coastal provinces.
\item [Hold] in place.
\item [Support] a unit attempting to remain in or move to a
neighboring province on the condition that the neighboring province is
accessible by the supporting unit (see the restrictions on movement
above for details).
\item [Convoy:] An army can move between any two coastal provinces on
the condition that there is intervening path of fleets at sea which
are ordered to make the convoy.
\end{description}
In this article, all diplomacy orders, units, and regions are
written in italics.  The symbols {\em A, F, S, C, H,} and
{\em \MovesTo} respectively
are used to represent armies, fleets, supports, convoys, holds, and
moves to. For example, {\em A Paris S F Brest\MovesTo{}Picardy}.

\subsection{Resolving Conflicts}

After all moves are ready, moves are read and the results are computed
following a number of simple rules:
\begin{itemize}
\item All moves are successful except as noted below.
\item There are three types of conflicts:
\begin{itemize}
\item When two units attempt to switch places (without being convoyed).
\item When two units attempt to enter the same region.
\item When a unit attempts to enter a region whose occupant has not
successfully moved out.
\end{itemize}
When equally supported units participate in a conflict, they remain in
their original regions.
\item In case of unequal support of units participating in a conflict,
only the best supported unit succeeds in its order.
\item If a unit attempts to remain in its region (or fails to leave
it) while another unit succeeds in entering that region, then the
first unit is {\em dislodged} and must retreat.
\item If a fleet in a convoy is dislodged, then the convoy has no effect.
\end{itemize}

Additional rules not particularly relevant to this article cover under
what circumstances a support can be {\em cut}.

During the retreat seasons, orders are given secretly and
simultaneously for all dislodged units without any preceding
negotiation period. Retreats are made according to the same rules as
ordinary moves except that one can not retreat to an occupied region,
the region the attacker came from, or a region in which two equally
well support units {\em bounced}. There is the possibility of {\em
disbanding} instead of
retreating. If two units are ordered to retreat to the same region,
then they are both disbanded. Note that any unit disbanded would then
be replaced during the upcoming Winter turn if the player retained the
same number of supply centers.

The Winter builds immediately follow the Autumn retreats without any
intervening negotiations. Players with excess units (in comparison
with their number of supply centers) are obliged to {\em remove} the
excess. Players with surplus supply centers can construct new units in
their empty {\em home} supply centers (unless another player owns
those supply centers).

\section{Search Techniques}
\subsection{Game Trees}

It is difficult to program intelligent machines even in
circumscribed domains such as Diplomacy. However, we need not throw up
our hands in despair; the fields of game theory and artificial
intelligence (A.I.) have at least some of the
answers.\cite{WinningWays,AI1,AI2,AI3}

Nearly all A.I. or game-playing programs search. Search is merely the
exploration of different alternatives. Humans search, but very slowly
and with many mistakes. For example, in the Diplomacy game
``Tokugawa'' \cite{Tok}, when
the authors---playing Austria and Turkey---were at war, Turkey was
surprised
when Austria forced him to disband one of his armies capable of
gravely harming Austria; Turkey had not foreseen the move
because he had not done a thorough enough search of Austria's possible moves.
Computers, on the other hand, have the ability to search quickly
and as completely as desired.

There are two main search paradigms: {\em breadth-first} and {\em
depth-first}.
With breadth-first search, one examines all the nodes at the same level at
the same time. Depth-first search involves going deep as quickly as possible.
In a game, the nodes of the tree represent possible states. These
states are reached by making moves (which are the arcs connecting the
nodes.)

What are we searching for? An
evaluation function is needed to evaluate the value of the resulting
positions. For example, the value of the final position in tic-tac-toe is easy
to analyze---if you have three in a row, you win; if your opponent
has three in a row, you lose; otherwise, when all the spaces are filled,
it is a tie. These final position evaluations can be rolled back
assuming that $O$ plays optimally and tries to minimize
$X$'s score.

Unfortunately, for nontrivial problems, it is impossible to search
completely within a reasonable amount of time. Chess is an example of
a game that cannot be solved completely because each piece has so many
so options, and there are so
many turns in a chess game. The combinatorial explosion that occurs
during the search is potentially deadly; even on the world's fastest
hardware it would take longer than the history of the universe to
``solve'' chess.

Because the units move simultaneously in Diplomacy,
analyzing Diplomacy completely is even more hopeless! Imagine
trying to evaluate all the possible opening moves for all the players
simultaneously---since each unit has at least four possible orders,
there are well over $4^{(3*6+4)}= 17,592,186,044,416$ possible sets of
orders for
Spring 1901,\footnote{The exact number of openings is
4,430,690,040,914,420 (not counting openings in which useless support
is given) \cite{Green}.} so we can not even analyze the first turn
completely, at
least not within a reasonable amount of time on today's hardware.
We are again confronted with two of the problems mentioned in the
introduction. First, we must find ways to reduce
the amount of searching, and second, we must find a position evaluation
function that does not depend on searching until the end of the game.

\subsection{Best-First Search}

Leaving the problem of evaluation aside until Chapter \ref{PE},
assume we have a human available to evaluate the
board positions, and focus on the search itself. There are a few
techniques to speed up the search. First of all, obviously you should
only
look at your units and nearby units of other players. If you want to
look at the whole board, then do it in pieces (divide and conquer).
Second, you can also use a combination of breadth-first and depth-first
search called {\em best-first} search.

The idea of best-first search is that if you keep looking in what appears
to be right direction at any given time, then you will find the best set
of orders more quickly. Generate all the valid orders for each the
unit to be ordered pruning out any illegal or stupid
orders.\footnote{Practically speaking, we can avoid generating illegal
orders. However, stupid orders are harder to avoid. A certain number
of them can be avoided by the use of an ``expert system'' weeding out
sets of orders violating certain maxims (such as ``Never cut your own
support.'') (See section \ref{expert}.) However, a certain number of
stupid orders remain in our search to be examined and rejected by
the position evaluation routine.} Next, evaluate
the partial position generated by each of the orders. Only one unit is
ordered at a time;  the other units are
assumed to hold for the moment. (See section \ref{ss} for other
possibilities here.) Choose the
order that seems to produce the best position, and fan out to the
subsequent partial positions. Then choose the partial position that seems
best---this may be in a completely different part of the search tree
than you were last working on, and so on.

To implement best-first search, only
the current leading edge (i.e., the leaves) of the tree need to be
stored in a list. This list is called an {\em agenda}. More formally,
best-first search may be implemented as follows:

\begin{itemize}
\item[1.] Initialize the agenda.
\item[2.] Loop:
\begin{itemize}
\item     Take the first node on the agenda.
\item     If the node is a goal node, then return it or store it and
keep searching.
\item     If the agenda is empty, then stop.
\item     Get the successors of the node and evaluate them.
\item     Insert the successors into the agenda and keep agenda sorted
by position value.
\item     Continue looping.
\end{itemize}
\end{itemize}
An example search is given in section \ref{exSec}.

\subsection{Even Better Best-First Search}

Of course, this best-first approach cannot be applied directly to analyzing
the effects of the enemy's moves. You could take the paranoid
conservative approach of evaluating a complete set of orders according
to the worst position that can be obtained by some combination of enemy
orders. Or better you could assume that the enemy is attempting to maximize
his own position given your orders, which does not necessarily undermine
your position.\footnote{Since Diplomacy is a multi-player game, it is not
necessarily zero sum. That is to say, cooperative behavior is
possible.}

In other words, in order to generate your best orders, it is helpful
to first compute good moves for the other players. To do so,
you may wish to run seven
processes in parallel (one for each country), which each try to devise
orders against the best current orders of the others. The processes
would communicate the current ``best-move-being-considered'' to each
other on rendezvous. Of course, the best move being
considered for your own country would also be sent into the ``game
master'' every time it
changed. By so doing, you are guaranteed to have submitted the best
move so far discovered by the deadline.

The process associated with each country would occasionally stop and
demand the
results of the other processes. The search would then be continued if
these moves were already those being used as the ``background'' for
the initial search. In the more likely case that one of the other
countries is now considering a new set of moves, we restart the search
using our last set of moves as our {\em seed}. (See section~\ref{ss}.)

If you can come
up with a number of alternative sets of enemy orders, then you can
evaluate a candidate set of orders according to how well it does on
average given these various sets of enemy orders. This should result
in a very robust set of orders that would work well in a wide variety
of situations, but would also allow the diplomat to take some worthwhile
risks.

Best-first search will not necessarily arrive at a complete set of orders
quickly, so you may wish to modify it slightly to {\em first} do a form of
depth-first search, choosing the best position at the current level to
expand (but always moving deeper and never jumping to a different
portion of the search tree). Thus, it will rapidly find a set of
mediocre orders to turn in just in case the rest of the search
bogs down and fails to terminate within the available time. After this
set of orders is found, it switches to the normal best-first search.

One way to optimize best-first search is to have it prune redundant
positions from the search tree. In Diplomacy, this is accomplished
by sorting each set of examined orders in some way and then caching
them. Any time the diplomat starts to evaluate a ``new'' set of
orders, it first checks the cache to make sure that it is not
simply a permutation of a set of orders that have already been
evaluated. This results in an exponential time savings with no
loss of completeness, since whole redundant subtrees get pruned.

Another approach to speed up best-first search is to modify it to
become a {\em beam search}, where a limit is put on the maximum number of
nodes in the agenda at any one time. The worst positions are discarded
when the beam limit is exceeded.

Given enough search time, we will include Summer retreats in our search for
Spring moves, and the Autumn retreats and Winter builds in our search
for Fall moves. This is feasible since the number of possible retreats
and builds is usually quite limited; we are not facing the same sort
of combinatorial explosion described above. Moreover, there are many
strategies that can only be seen by looking ahead that far.

One such look-ahead strategy is called the {\em offensive retreat}, which
takes advantage of the fact that a unit can often retreat ``forwards.''
For example, if England
attacks the French {\em F Belgium} with support from {\em Holland} while
the {\em North-Sea} is open, France can retreat there and menace the entire
East coast of England as well as Scandinavia and the Lowlands.
It is equally important to look ahead at the builds, if
Austria-Hungary is at war with Turkey and wants to build a fleet, then
he should take care during the Fall movement that {\em  Trieste} is
left open for such a build.

Even if anticipating retreats and builds were not needed for strategic
reasons, they are clearly needed on a diplomatic level. Diplomacy is
not allowed before retreat and build season. Thus, any message
concerning those seasons should be anticipated and sent to allies
before the preceding movement season.

However, looking ahead to future movement seasons will often be
futile, because the computer is
likely to be wrong about the
opponents' moves and the mistakes will get compounded for each turn
looked ahead. Nevertheless, it is certainly worth an attempt if the
computer has identified two or more near optimal sets of orders and
no other orders at the current turn seem worth exploring.

\subsection{Seeds}\label{ss}
So far we have focused mainly on step two of the best-first search:
the ``loop.'' However, the initialization of the agenda is also of
primary importance. The agenda must start off with a
set (or possibly several sets) of
orders for all countries which serve as the default for each
unit. Possible changes to these orders are considered as the
search progresses.

However, a great deal of
time can be saved by choosing a good set of initial orders. In theory,
we have no idea where to start. For this reason, the example
considered below in section \ref{exSec} is arbitrarily
initialized with the possibility of all units holding. From this
starting point, a reasonable set of orders is devised.

In practice,  we can do better sometimes. For example, it is more
reasonable to assume that units which remain in the same position as last
turn will repeat their movement, than to assume that they will hold.
Moves suggested, promised, or threatened by other players can be
easily tested by using those moves as a seed.
Moreover, as mentioned above, while
computing the orders for one country, we use the results of our
computations for other countries as a background along with the
previous results in the most recent calculation for that country.
As we will see below, predictability can be avoided by choosing a
random seed, and conjectured alliances can be verified by using the actual
moves of the previous turn as a seed and measuring its stability.

\section{Position Evaluation} \label{PE}
\subsection{Simple Model}

A static position evaluator is a necessity for a diplomat. The are
many issues at stake in the design of such an evaluator, so the
following describes merely our thoughts on the subject.
We assume that the
diplomat has a module that knows the rules of Diplomacy well, if not
perfectly. A position and a set of orders is passed into this module,
and it produces a new position. Then this position must be evaluated.

Suppose we are trying to develop a position evaluation function for
Turkey. Now imagine a mountain with its peak in Turkey, sloping down
to the Balkans, Austria, and Russia, and then down further to Italy and
down further to France, maybe a bit higher through the Mediterranean and
around Iberia and the critical Straights of Gibraltar, and then down
further to Germany, England, and Scandinavia. The position evaluation at
any given time could be the sum of the ``altitudes'' of the territories
controlled by Turkey. This means that Turkey is very, very valuable to Turkey,
which is what we would hope. However, this does not mean that Turkey
leaves all her units there; she still controls the mountain
peak when she leaves it to take the lower ground provided no
enemy sneaks through the front lines.  Therefore, sets of orders
that expand Turkey's influence to other high ``altitude'' areas are
generally preferred over ones that would expand to low ``altitude'' areas.

Here we will give an example of an ``influence'' measure.
If a player has a unit in a territory, it controls that territory. Otherwise,
if a territory is vacant, the nearest units
share in the control of it. For example, if the closest units to
{\em Bulgaria} are two regions away, say, a Russian {\em F
Sevastopol}, a
Turkish {\em A Budapest} and Turkish {\em F Ionian-Sea}, then
Turkey gets two-thirds of how much she values {\em  Bulgaria}, while
Russia gets
one-third of how much she values {\em  Bulgaria}. The theory behind
this is that
if there were suddenly a race for the territory, then all the units
would get there at about the same time. Distance should take into
account the accessibility of territories to armies/fleets; thus, for
example, any army is infinitely distant from the {\em
Ionian-Sea}.\footnote{These
distances would be calculated one time only, and then stored in the
form of a table which could also serve as an adjacency matrix. Thus,
these distances do not take convoys in account.}

This distance measure of control will cause the diplomat to
desperately try to avoid letting ``raiders'' slip through its front
lines, and to try to slip its own units through enemy lines. However,
since the diplomat considers enemy territory less valuable than his
own, the diplomat will never try to slip a unit through at the expense
of his own security. This is to be desired, since a war without
organized fronts between two
countries can rapidly lead to mutual destruction.

\subsection{Example}\label{exSec}

In practice, ``book openings'' will be stored  to give the program a
little head start, but as an exercise consider the analysis of the
opening moves for Turkey. First, the diplomat adjusts the values of holding
certain territories according to its current foreign policy, the
season, and so on. Let us suppose that the territories have these
values: {\em Constantinople=100, Ankara=90, Smyrna=90, Syria=50,
Armenia=60, Black-Sea=60, Sevastopol=70,
Rumania=50, Bulgaria=70, Greece=60, Aegean-Sea=60,
Eastern-Mediterranean=50, Ionian-Sea=50}. (See Table~\ref{figT}.)
\fig{Turkey and its Neighbors}{\label{figT} \rule{0cm}{4.25cm}
\rule{6cm}{0cm}}
Next the diplomat begins the best-first phase. Initially, having
chosen ``all units hold'' as our seed, we start a
time $t=0$ with a one element agenda, the null order set (Table~\ref{t0}).
\fig{Agenda $t=0$}
{\label{t0}$  (\{\}=815) $}
In the above notation, the outer parentheses mark the beginning and
end of the agenda, the braces surround the current order set, and
the number after the equal sign is the evaluation of the position
achieved by this order set. The null order set is equivalent to ``all
hold.'' The position evaluation of ``815'' was obtained by summing up the
values of the territories occupied by Turkey plus a proportion of the
values of territories that a partially controlled. (For example,
it gets 40 points for {\em Armenia}, 2/3's of its total's worth of 60, because two
of Turkey's units could move there in one turn, while only one of
Russia's could move there in one turn.) A 100 point
bonus is given for owned supply centers and a 50 point bonus for
opponent/neutral supply centers that are occupied in the Spring.

Next, this element is removed from the agenda, and all the legal
moves for each individual unit are generated, the resulting positions
are evaluated,
and the order sets are sorted (Table~\ref{t1}).
\fig{Agenda $t=1$}
{\label{t1}\begin{tabular}{lcr}
(\{{\em A Constantinople\MovesTo{}Bulgaria}\}&=&959,\\
  \{{\em F Ankara\MovesTo{}Black-Sea}\}&=&870,\\
  \{{\em A Smyrna\MovesTo{}Armenia}\}&=&845,\\
  \{{\em F Ankara\MovesTo{}Armenia}\}&=&845,\\
  \{{\em A Constantinople\MovesTo{}Smyrna}\}&=&815,\\
  \{{\em F Ankara\MovesTo{}Constantinople}\}&=&815,\\
  \{{\em A Constantinople\MovesTo{}Ankara}\}&=&815,\\
  \{{\em F Ankara\MovesTo{}Constantinople}\}&=&815,\\
  \{{\em A Smyrna\MovesTo{}Constantinople}\}&=&815,\\
  \{{\em A Smyrna\MovesTo{}Ankara}\}&=&815,\\
  \{{\em F Armenia S A Constantinople}\}&=&815,\\
  \{{\em F Armenia S A Constantinople}\}&=&815,\\
  \{{\em A Constantinople S F Ankara}\}&=&815,\\
  \{{\em A Constantinople S A Smyrna}\}&=&815,\\
  \{{\em A Smyrna S A Constantinople}\}&=&815,\\
  \{{\em A Smyrna S F Ankara}\}&=&815,\\
  \{{\em A Smyrna\MovesTo{}Syria}\}&=&815,\\
  \{{\em A Constantinople H, F Ankara H, A Smyrna H}\}&=&815)
\end{tabular}}
There is no need to generate support orders for moves before other
units have moved. Also, there is no need to generate hold moves
explicitly in the search, since units are always assumed to hold if
they do nothing else, and holding will not change the position
evaluation.\footnote{Notice the ``all hold'' element at
the end. It inserts this near-equivalent of \{{\em }\}=815, just in
case holding
is actually the best thing to do. If this were to ever get to the
top of the agenda, it would test holding all its units against enemy
orders.} Each of the possible moves is tried
individually and the
resulting partial positions are evaluated. {\em A Constantinople\MovesTo{}Bulgaria} is
extremely
valuable, giving Turkey new influence in valuable areas as well as
occupying a neutral supply center.\footnote{``There is no substitute
for {\em A Constantinople\MovesTo{}Bulgaria}.'' \cite{PG}} {\em F Ankara\MovesTo{}Black-Sea} is good, since it gives
Turkey influence in {\em Rumania}, but it is not as good as {\em A
Constantinople\MovesTo{}Bulgaria,}
especially since it does not grab a supply center. (Remember, the
computer does not do all these rationalizations, at least not at this
time---it just looks to maximize the evaluation function.)

Now, {\em A Constantinople\MovesTo{}Bulgaria} is made as the first step in the search and this order
set is removed from the agenda. Next, all the legal second moves given
{\em A Constantinople\MovesTo{}Bulgaria} are generated and inserted into the agenda.
(See table~\ref{t2}; asterisks mark
the new elements.)
\fig{Agenda $t=2$}
{\label{t2}\begin{tabular}{rlcr}
(*&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Black-Sea}\}&=&1014,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Constantinople}\}&=&1004,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna\MovesTo{}Armenia}\}&=&979,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna\MovesTo{}Ankara}\}&=&959,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna S F Ankara}\}&=&959,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna\MovesTo{}Syria}\}&=&959,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara H, A Smyrna H}\}&=&959,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Armenia}\}&=&949,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Smyrna\MovesTo{}Constantinople}\}&=&939,\\
&  \{{\em F Ankara\MovesTo{}Black-Sea}\}&=&870,\\
&  \{{\em A Smyrna\MovesTo{}Armenia}\}&=&845,\\
&  \{{\em F Ankara\MovesTo{}Armenia}\}&=&845,\\
&  \{{\em A Constantinople\MovesTo{}Smyrna}\}&=&815,\\
&  \{{\em F Ankara\MovesTo{}Constantinople}\}&=&815,\\
&  \{{\em A Constantinople\MovesTo{}Ankara}\}&=&815,\\
&  \{{\em F Ankara\MovesTo{}Constantinople}\}&=&815,\\
&  \{{\em A Smyrna\MovesTo{}Constantinople}\}&=&815,\\
&  \{{\em A Smyrna\MovesTo{}Ankara}\}&=&815,\\
&  \{{\em F Armenia S A Constantinople}\}&=&815,\\
&  \{{\em F Armenia S A Constantinople}\}&=&815,\\
&  \{{\em A Constantinople S F Ankara}\}&=&815,\\
&  \{{\em A Constantinople S A Smyrna}\}&=&815,\\
&  \{{\em A Smyrna S A Constantinople}\}&=&815,\\
&  \{{\em A Smyrna S F Ankara}\}&=&815,\\
&  \{{\em A Smyrna\MovesTo{}Syria}\}&=&815,\\
{} & \{{\em A Constantinople H, F Ankara H, A Smyrna H}\}&=&815)
\end{tabular}}
Here the moves that increase the position evaluation the most are
{\em F Ankara\MovesTo{}Constantinople} and {\em F Ankara\MovesTo{}Black-Sea}, since these are at the top of the
list. The
position with {\em F  Ankara\MovesTo{}Black-Sea} is explored first, since it seems slightly
better. The allowable subsequent orders are:  {\em F Smyrna\MovesTo{}Constantinople, A Smyrna\MovesTo{}Ankara, A
Smyrna\MovesTo{}Syria, A Smyrna\MovesTo{}Armenia}.

This yields Table~\ref{t3} (again, asterisks mark the new elements).
\fig{Agenda $t=3$}{\label{t3}\begin{tabular}{rlcr}
(*&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Black-Sea,A Smyrna\MovesTo{}Armenia}\}&=&1034,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Black-Sea,A Smyrna H}\}&=&1014\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Black-Sea,A
                        Smyrna\MovesTo{}Syria}\}&=&1014,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Constantinople}\}&=&1004,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Black-Sea,A
                        Smyrna\MovesTo{}Constantinople}\}&=&1004,\\
{} *&\{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Black-Sea,A
                        Smyrna\MovesTo{}Ankara}\}&=&989,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna\MovesTo{}Armenia}\}&=&979,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna\MovesTo{}Ankara}\}&=&959,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna S F Ankara}\}&=&959,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,A Smyrna\MovesTo{}Syria}\}&=&959,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara H, A Smyrna H}\}&=&959,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Armenia}\}&=&949,\\
&  \{{\em A Constantinople\MovesTo{}Bulgaria,F Smyrna\MovesTo{}Constantinople}\}&=&939,\\
&  \{{\em F Ankara\MovesTo{}Black-Sea}\}&=&870,\\
&  \{{\em A Smyrna\MovesTo{}Armenia}\}&=&845,\\
&  \{{\em F Ankara\MovesTo{}Armenia}\}&=&845,\\
&  \{{\em A Constantinople\MovesTo{}Smyrna}\}&=&815,\\
&  \{{\em F Ankara\MovesTo{}Constantinople}\}&=&815,\\
&  \{{\em A Constantinople\MovesTo{}Ankara}\}&=&815,\\
&  \{{\em F Ankara\MovesTo{}Constantinople}\}&=&815,\\
&  \{{\em A Smyrna\MovesTo{}Constantinople}\}&=&815,\\
&  \{{\em A Smyrna\MovesTo{}Ankara}\}&=&815,\\
&  \{{\em F Armenia S A Constantinople}\}&=&815,\\
&  \{{\em F Armenia S A Constantinople}\}&=&815,\\
&  \{{\em A Constantinople S F Ankara}\}&=&815,\\
&  \{{\em A Constantinople S A Smyrna}\}&=&815,\\
&  \{{\em A Smyrna S A Constantinople}\}&=&815,\\
&  \{{\em A Smyrna S F Ankara}\}&=&815,\\
&  \{{\em A Smyrna\MovesTo{}Syria}\}&=&815,\\
&  \{{\em A Constantinople H, F Ankara H, A Smyrna H}\}&=&815)
\end{tabular}}

So, finally we have several promising full candidate set of orders,
the best of which is {\em A
Bul\-gar\-ia\MovesTo{}Con\-stan\-tin\-o\-ple, F Ankara\MovesTo{}Black-Sea, A Smyrna\MovesTo{}Armenia}. This is in fact
one of the popular openings for Turkey.\footnote{``This opening poses
all sorts of problems for Russia. If she trustingly ordered {\em F
Sevastopol\MovesTo{}Rumania,}she is in real trouble. Even if she
ordered {\em F Sevastopol\MovesTo{}Black-Sea}, she is going to have
difficulty maintaining her position.'' \cite{PG}}
However, the numbers in Table \ref{figT}
were not rigged to make the orders come out right for this one case.
Our system should derive decent orders for any position.
Since this is at the top of the agenda,
it then looks at the enemy units that could have some effect on the
outcome: {\em F Sevastopol, A Moscow, F Trieste, A Budapest, F
Naples}. It tries possible the
orders for these enemy units. It evaluates the possible final positions,
and gets some points knocked off for the possible conflict in the
{\em Black-Sea} with the Russian fleet as well as the advances of Austria
and Italy. Since this is
the only set of orders thus far that have been tested against enemy
orders, it is considered the current best set of submittable orders.

If it has extra time for processing, then it would leave depth-first
mode and try the next item
on the agenda, namely \{{\em A
Con\-stan\-tin\-o\-ple\MovesTo{}Bul\-gar\-ia,F
Ankara\MovesTo{}Black-Sea,A Smyrna H}\}=1014. This is that
special element that got put on the agenda when \{{\em A Constantinople\MovesTo{}Bulgaria,F Ankara\MovesTo{}Black-Sea}\}=1014
was exploded into its successors. It is tested against enemy orders,
and it gets even more points knocked off for the possible Russian
advance. Since this is not better than the current best set of
submittable orders, it is discarded.

If the deadline has still not yet arrived, the diplomat would examine
the position \{{\em A Con\-stan\-tin\-ople\MovesTo{}Bul\-gar\-ia,F Ankara\MovesTo{}Black-Sea, F
Smyrna\MovesTo{}Syria}\}=1014 against enemy orders, and presumably discard it. Next, it
would come to \{{\em A Constantinople\MovesTo{}Bulgaria, F
Ankara\MovesTo{}Constantinople}\}, and this represents a jump to
a different part of the search tree. It would look at all the positions
generated by the orders for {\em A Smyrna} in this context.  And so on until it
ran out of time, at which point the best set of moves that have been
tested against enemy moves would already have been submitted.

\subsection{Advanced Models}

\subsubsection{Massive Support}
Consider the following problem. Let us assume there is
one space much more important than all of the others for Turkey at the
moment. Perhaps he is in an end-game position where he has captured
all of the Mediterranean, but now his navy is bottled up at Gibraltar.
The {\em Mid-Atlantic-Ocean} now becomes of critical importance to him.
Assume England is defending with a fleet doubly supported ({\em F
Mid-Atlantic-Ocean
H, F Irish-Sea S F MAid-Atlantic-Ocean, F Gasconoy S F
Mid-Atlantic-Ocean}), and
that Turkey has four fleets available to attack ({\em F North-Africa,
F Western-Mediterranean, F Spain(sc), F Portugal}).

The evaluation routine described above will give a low score to any
attack of the Mid-Atlantic Ocean consisting with two or less supports,
since such attacks would fail. Only the triply supported attack is
given a high score. Since the search routine only adjusts the move for
one unit at a time. Thus,  before finding the solution, the diplomat
must three times consider
orders increasingly low on the agenda.

In order to avoid this problem, we rewrite the agenda creation routine
so that whenever an order is placed on the agenda, all of its possible
combinations of support are placed there as well.

Obviously the same objections hold for long convoys, so all possible
convoys should be immediately placed on the agenda as well.

\subsubsection{Limited Raiding}

The system above is designed to avoid raiders at all costs. A position
containing such a unit which exerts its control over much valuable territory
is given an extremely low value. Similarly, by forcing one unit
behind enemy lines, our program thinks that it now controls the
majority of the enemies territories with that one unit.

However, this is not very accurate. A single raider presented with a
variety of supply centers can still only occupy one
each year. Thus, the control exerted by the raider must be diluted by
the number of supply centers that it controls.
Hence, this evaluation function gives a single enemy
unit access to seven supply centers if one could by so doing capture
two or three supply centers ourselves.

Also, we let a player's owned home
supply centers exert control, since units could be built there (but
probably the distance should be a little greater, since the units
don not exist yet and might never). Furthermore, any owned supply center
should give a bonus to the owning
player's position evaluation, regardless of the status of ``control.''
A further refinement is to have units exert partial control in
any adjacent territories, even those occupied by enemy units.

\subsubsection{Advancing Towards the Front} \label{expert}

Now comes the most problematic aspect of position evaluation. An
example will illustrate the problem best. Suppose Turkey has
taken over half the board and builds an {\em A Smyrna}.
`How do we make the position evaluation encourage the
movement of the army towards the ``front''?  The above notions of
``control'' would not make the army move towards the front. All of the
spaces in the vicinity of Smyrna are already 100\% under the control
of Turkey, so no move by the army could possibly increase the value of
the position which is based solely on control of territory.

However, this is not the right question to ask. Although most of the
intelligence should be put into the position evaluator, some
intelligence can be put
into the order generation and agenda sorting as well. For example, a
heuristic
like ``always prefer a support order in preference to a hold order'' can
more easily be put into the agenda sorting (or move generation)
routine than the position evaluation routine (which is supposed to be
``static'' and just look at the resulting board position, not the orders
themselves {\em per se}). So, we can add a heuristic to make it prefer
orders that move in the ``right direction'' over all other orders,
everything else being equal (i.e., the position evaluation being the
same).

One could adopt a gravitational
attraction analogy, where the unit would be seek to reduce the
distance between itself and uncontrolled (or partially controlled)
territories, weighting valuable territories proportionally more and
distant territories much less. The idea is to have the unit prefer to
move to where it can do the most good in the least time. For example,
this will insure that Turkey does not order {\em A Smyrna\MovesTo{}Syria}, because
that would move the army away from valuable partially controlled
territories; even if its position evaluation were the same as the
other orders, the best-first search would put {\em A Smyrna\MovesTo{}Syria}
``behind'' the
other orders.

This preference to have units near where they could do the most good
{\em could} instead be incorporated directly into the position evaluation,
but then it has to be set up carefully to always be a less important
factor than control.

\subsubsection{Meta-strategy}

Another way to emphasise the regions of the board which are the most
important---the front---is to view
the game at more abstract levels, with
provinces grouped together into meta-provinces, and perhaps meta-provinces
grouped together into still larger regions, and so on. (See
Table~\ref{MM})
\fig{Meta-Map}
{\label{MM} \rule{0cm}{6.5cm} \rule{9.5cm}{0cm}}
Thus, if the system
for some reason decided that it was important to go for Scandinavia,
then it could raise the ``altitude'' of that whole region as well the
territories leading up to it. A strategy of a diplomat might be to
dominate one portion of this hierarchy before moving on to the next.

However, how would we determine which meta-provinces to concentrate on
at any given moment? We should translate the current position to an
equivalent position on the meta-map. Then we can recursively invoke
our diplomat on this meta-position. In the meta-game, it is much easier
to find moves because the number of units and spaces is
correspondingly smaller. The meta-moves that we find could then be
used to determine the short-term goals of our program.

\subsubsection{Large Searches}

The above problem-reduction technique is useful for coming up with moves
involving many units. Given a constant time limitation, countries with
more units must do a more superficial search of their move tree due to the
resulting increase in the number of combinations possible.
Using a beam search to save time results in the consideration of an
agenda in which all of the set of orders being considered are
essentially the same. This
is good if we are trying to fine-tune a strategy, but not if we are
interested in examining fundamentally different tactics: for example,
the choice between a static defense and a ``the best defense is a
good offense'' approach.

If the fundamental strategy has already been solved on a meta-game
level, then the beam search described above is satisfactory for
determining the desired moves to carry out that strategy. However, we
can do better than that. If the meta-strategy indicates that we should
attempt to capture a certain two meta-provinces which are not adjacent
(for example, Turkey pushing  westward with his fleets while his armies
are busy fighting in the Moscow area), then the real set of moves on
each front can be determined independantly in most cases. This
technique of ``divide-and-conquer''  results in a real
improvement in speed for large countries fighting on multiple fronts.

Obviously, this is not the only solution. Another possibility is to
counter the increase in units with a corresponding increase in the
width of the search (to the obvious deteriment of the search depth).

Finally, another possibility is to add a random element to the search
path in order to prevent all the search paths from heading the same
way. For example, start the search with a variety of seeds (see
section \ref{ss}) chosen randomly perhaps. At each step in the search,
instead of chosing the
highest item on the agenda, chose agenda items on the basis of a
decaying probability distribution. The final decision of what set
orders to submit will be made on a basis of the value of the resulting
positions. However, the order in which we look at positions will
include a slight random factor.

\subsubsection{Predictability}

This also solves our next problem: ``Predictability.''

Until now, the behavior of our diplomat was completely
predictable. However, sometimes the best move is not well defined. For
example, suppose Italy with his {\em F Tyrrhenian Sea} is trying to
counter Turkey's
incursion into the {\em Ionian-Sea}. Italy must guess whether Turkey
is going to
attack {\em Naples} or {\em Tunis}. If Turkey is completely
predictable, then
Italy will be able to counter the advance. However, if Turkey
recognizes the situation and makes a random choice, then he has a 50\%
chance of gaining a supply center.

In theory, such {\em guessing games} should be recognized and
probabilities chosen according to well-established principles of game
theory. However, in practice, the best we hope to do is to add a small
random factor to the evaluation of each position in order to avoid
this problem.

\subsubsection{Province Values}

There are a number of refinements we can make to the mountain model
of control merely by adjusting the values of regions.
For example, territories with supply centers should be more valuable,
especially in the Fall when supply center ownership is established.
Similarly, territories that border many supply
centers should be more valuable in the Spring.

We must also give a value to certain {\em combinations} of spaces
being controlled. The existance of stalemate lines means that a
position is more the sum of its parts. If England is holding ({\em F Portugal,
F Mid-Atlantic-Ocean, F English-Channel, F Irish-Sea}) against an
alliance of southern powers that have
overrun the continent, he can still hold his ground against further
advances. \cite{PG} England can thus secure his place in a draw, or
divide the
alliance against him. Hence, England's position is much more powerful
than a simple numerical count of his supply centers might indicate.
Our program must be able to detect all known stalemate lines, and be
able to form them itself.

In the final analysis, machine learning techniques are
necessary to lift the
tactics of computer Diplomacy players up to a level competitive with
even
novice human players. For example, the base values of territories could be
guessed at by humans and then optimized by machine learning
techniques. The computer could play games with various base values and
experiment in an attempt to derive better values.

\section{Diplomacy}

In the example given in the above section, the search wound up with
an anti-Russian opening. However, Turkey's neighbors have had no say
in the matter. In a real Diplomacy game, Turkey tries to negotiate
with all his neighbors. If Russia sounds trustworthy and presented an
intelligent plan for attacking Austria, then the anti-Russian opening
might be abandoned in favor of an anti-Austria one. Similarly, if
Austria declares war on Turkey, Turkey would be unwise to
risk a two-front war by moving against Russia.

That is to say,
so far we have outlined a recipe with which to create a fairly
intelligent tactician; however, we have not yet mentioned {\em diplomacy}
which after all is
the name of the game.  But how can seven diplomats programmed by
different people carry out any sort of meaningful dialog? To solve
this problem, one of the authors has developed for the Diplomacy
Programming Project a program called the {\em Diplomacy Interface}.
Although written in LCS,\footnote{LCS is a variant of SML developped by
Bernard Berthomeiu and the {\em Outil et Logiciel pour le
Communication} group at the {\em Laboratoire d'Automatique et de
l'Analyse de Syst\`{e}me} in Toulouse. LCS differs from SML in that it
allows the management of parallel processing.} it is capable of
launching programs written in any language compatable with the UNIX
operating system. The standard input of each diplomacy is rerouted via
the use of an LCS instream to the LCS agent responsible for the
communication with that player-diplomat. The agent then waits for a
rendezvous with the main program. If the main program has any output
to send to the player, it spawns a new process which waits for a
rendezvous with that player's communication agent. This agent then
can put the output in the corresponding LCS outstream where it is now
available on a FIFO basis as input for the diplomat.

To simulate the play of humans against machines, the Diplomacy
Interface is capable of opening {\em X} or {\em suntools} windows for
specified countries. Input and output for these windows are handled in
a manner similar to that described above. Actually, it is impossible
for a diplomat to determine who is playing the other countries: a
human or a machine. Thus, no built in prejudices against humans can be
built into the diplomats, and humans must perform a sort of Turing test
if they wish to distinguish the machines from other humans.

All communications are handled by the Diplomacy Interface; however, this
interface can then in turn forward messages when desired.
The evaluation routines used to compute the strategy of each diplomat
can then take into account this communication. Most of the
above discussion of position evaluation still
applies, except that the holdings of an alliance should be considered
together in the evaluation of ``altitude controlled,'' plus some catches
need to be put in to prevent accidental stabs. The diplomat should
probably work out two sets of orders each turn---one in the case that
it remains an ally and one in the case that it stabs.\footnote{We are
ignoring the possibility of {\em queasling}: that is to say,
fulfilling your obligations in a minimal way or undermining the treaty
in minor ways in order to maximize your self interests without
necessarily entering war with your former ally. Presumably, a weak
ally will have no choice but to allow a certain amount of {\em
queasling} to go unpunished. But it is difficult to predict at what
point your ally will declare your moves to be a ``stab.'' Our simple
model assumes two choice: either complete loyalty, or total war on all
fronts.}

Similarly, the system should work out
two sets of the opponent's orders (one in which he stays allied, one
in which he stabs), and evaluate the outcomes using the techniques of
game theory. This allows the program to
work with an ally while protecting itself.

If the predicted
profits for stabbing exceed its estimated benefits for cooperating,
then it stabs. Often the table of payoffs  forms a version of the
{\em Prisoner's Dilemna}. In that case, we must determine if this
stabbing opportunity is likely to repeat itself. In a one-time
prisoner's dilemma situation, there is no reason not to stab. However,
if such situations are anticipated to  arise again many times among the
several players, then you should respond with {\em tit for
tat}.\footnote{Given its simplicity, tit for tat is a suprisingly
effective strategy for
repeated prisoner's dilemma against multiple opponents. Tit for tat
begins with cooperative behavior in the first instance. In ensuing
rounds, tit for tat adopts the strategy employed last round by the
opponent.} The exact
frequency and number of players necessary depends on the size of the
short-term gains to be won with the stab. In a game with the
following payoff matrix:
\begin{center}
\begin{tabular}{l||l|l}
        &Stab   &Cooperate\\ \hline \hline
Stab    &0$\backslash$0 & $-2\backslash 5$ \\ \hline
Cooperate& $-2 \backslash 5$ & 3$\backslash$3
\end{tabular}
\end{center}
corresponding to a specific prisoner's dilemma, the ``best'' strategy
in a certain sense is to be cooperative until the other player stabs,
and then after that to always stab. However, the last three
opportunities to stab should be taken regardless of what the opponent
does.\cite{PD}

To extend the strategy to the game Diplomacy, we will instruct our
program to consider its worse enemy to be the last player to have
stabbed it. However, stabs of allies will be anticipated when so doing
is profitable and there are unlikely to be similar future stab
possibilities.
 For example, if the game can
be won with a stab, then our diplomat will always do so. However, it
will not stab an otherwise trustworthy ally for a single supply center
especially if that supply center could be taken in a later stab if
necessary.

A complete description of the communication language developed for
the Diplomacy Interface is beyond the scope of this paper.  However,
below are a few examples to give the reader the feel for what is
possible to express with this language.  English translations are
given below each.

\small

FRM ITA (FRM AUS (DWR))\\
``Italy says Austria has declared war on him.''

FRM RUS (DMZ (BLA ARM))\\
``Russia proposes that the {\em Black Sea} and {\em Armenia}
be kept empty as a sort of Demilitarized Zone.''

SUB (AMY CON MTO BUL)\\
``I submit the move  {\em A Constantinople\MovesTo{}Bulgaria}.''

SND RUS (IFF ALY (VSS AUS) THN (YES (DMZ BLA ARM)) ELS (NOP (DMZ
BLA ARM)))\\
``If you, Russia, agree to ally with me against Austria then I accept
your proposal to demilitarize the {\em Black Sea} and {\em Armenia},
otherwise I reject it.''

\normalsize

A diplomat gains some ideas about alliances via the negotiations;
then it gets the results of one turn. How can it determine if the two
are at odds? This is done in several steps.

First, all of the promises
made and received are stored in a truth maintenance system capable of
making simple deductions. For example, if Russia had sailed to the
{\em Black-Sea} after having agreed to (DMZ (BLA)), this system would
indicate that as a violation of their
specific agreement to keep the {\em Black-Sea} demilitarized. The program
would then nullify all agreement between the players (unless there was
another independant reason for keeping them), and anticipate hostile
actions in the future. Similarly, any conflict between Italian and
Turkish units would contradict a general promise of peace such as
(FROM TUR (PCE)).

Second, whether or not a player made any promises, it is a good idea
to reward him for any help he may have given, and to punish him for
any attacks.\footnote{In fact, an attack without warning should be
dealt with more harshly than an attack with adequate warning.}
Sometimes, a country may help with first promising to do so. Usually
this is due to a fear of ``leaks,'' but it could also be due to an
inability by that player to generate understandable diplomatic
messages. In either case, any country whose orders truly increased or
decreased the value to Turkey of the resulting position should be
raised or lowered on the ally/enemy scale in accordance.

Finally, the moves for each country should be used as a seed for a
short search of last turn's moves. If given the assumed combination of
alliances, we find a much better set of moves that the player could
have submitted but did not, then we must reject our assumptions and
continue searching the possible combinations of alliances until we
find out for which the moves actually made are the best moves possible
(or fairly good in any case). For example, a diplomat might assume
that Austria and Italy are at war due to a statement by Italy, but
then their moves might not make sense given this
assumption.  After some search, the diplomat could potentially
discover that the only reasonable interpretation of their moves is
that Austria and Italy are allied against Turkey and Russia.

\section{Conclusions}

In summary, an efficient search coupled with a good position
evaluation function allows a computer diplomat to devise decent
tactics. While for humans, diplomacy is often more important than
tactics, the first computer diplomats we create will better off with good
tactics before good diplomacy, because it is much harder for programs
(written perhaps by different people) to communicate with each
other.\footnote{Not to mention the problems of cooperating at strategic and
tactical levels, not being too gullible, knowing when to stab, etc.}
Despite having devised a protocol and written an interface for
communication between diplomats, much work remains to be done.

However, by attempting to solve these difficult problems resulting
from the interactions of intelligent agents, we hope to better understand
what sort of capabilities such an agent should have, when it cannot assume
that everyone in the world is its friend.  Creating programs that
can play complex multi-player games is a step towards creating programs
that can deal with the complexities of the real world and the craftiness
of humans.

\begin{thebibliography}{9}
\bibitem{Rules} {\sc A. Calhammer}, Rules to Diplomacy, The Avalon Hill Game
Company, 1956, 1976.
``Diplomacy'' is a trademark of the Avalon Hill Game Company.
\bibitem{Map} The Avalon Hill Game Company, Diplomacy Conference Map, 1976.
\bibitem{WinningWays} {\sc E. R. Berlekamp, J. H. Conway and R. K.
Guy}, ``Winning Ways for your Mathematical Plays,'' Academic Press, 1982.
\bibitem{AI1}  {\sc Charnkiak,
Riesbeck, McDermott and Meehan}, ``Artificial Intelligence Programming,''
(2nd edition 1987).
\bibitem{AI2} {\sc  Charnkiak
and McDermott}, ``Introduction to Artificial Intelligence,'' (1985).
\bibitem{FSM} {\sc N. V. Findler, G. L. Sicherman, and B. McCall}, ``A
multi-strategy gaming environment,'' {\em Computer Game Playing:
Theory and Practice}, {\sc M. A. Bramer} editor, Ellis Horwood
Limited, Chichester, England, 1983.
\bibitem{AI3}  {\sc  Forsyth and Rada}, ``Machine Learning: Applications in
Expert Systems and Information  Retrieval,'' (1986).
\bibitem{PLS} {\sc Feng-Hsiung Hsu, Thomas Anatharaman, Murray
Campbell, Andreas Nowatzyk}, {\em Un Ordinateur parmi les grands
ma\^\i{}tres d'\'{e}checs}, Pour la Science, {\bf 156}, October 1990,
pp. 88--97.
\bibitem{Green} {\sc T. Green}, private communication.
\bibitem{Tok} {\sc Eric S. Klien} ed., ``Tokugawa,'' Electronic Protocol,
{\sc R.~L.~Cochran Jr. and C.~B.~McKee} game masters, 1989--90.
\bibitem{PD} {\sc D. Loeb}, Prisoner's Dilemna, LAAS Internal Report.
\bibitem{AI4} {\sc Elaine Rich}, ``Artificial Intelligence,'' (1983/1989).
\bibitem{Chess} {\sc Warner Smith}, Technology Review, July 1990, p. 45.
\bibitem{Tz} {\sc C. Tzeng}, ``A Theory of Heuristic Information in
Game-Tree Search,'' Springer-Verlag, New York, 1988.
\bibitem{PG} {\sc Rod Walker}, ``Player's Guide to Diplomacy,'' The
Avalon Hill Game Company, Baltimore, 1979.
\end{thebibliography}
\end{document}