2022-01-20

Marathon Match 132 BouncingBalls

競技プログラミング

問題

www.topcoder.com

方針

銃の位置を焼きなましで決める
銃の位置決定後はビームサーチで良い盤面を探す
ボールが真ん中に集まるように盤面を作成する

銃の位置を焼きなましで決める (START ~ 1 sec)

まず最初に銃の位置を固定するために焼きなましを行って良さそうな銃の配置を決めます。

遷移としては以下の 1つだけです。

ランダムに銃の位置を変化させる

銃の位置が決まったあとは、シミュレーションを行い貪欲にパネルを配置していってスコアを求めます。

貪欲に配置

パネル配置については以下のルールのもと行われます。

新しい空白マスに到達した際に

ボールが上方向に進んでいる場合
- ボールが盤面の左側に存在する場合は / のパネルを設置
- ボールが盤面の右側に存在する場合は \ のパネルを設置
- それ以外はランダムにパネルを配置

f:id:simanman:20220120061550p:plain:w240

f:id:simanman:20220120061621p:plain:w240

ボールが右方向に進んでいる場合
- ボールが盤面の上側に存在する場合は \ のパネルを配置
- ボールが盤面の下側に存在する場合は / のパネルを配置
- それ以外はランダムにパネルを配置

f:id:simanman:20220120062111p:plain:w240

f:id:simanman:20220120062328p:plain:w240

下方向と左方向については上方向と右方向を逆にしたものになります。

貪欲にパネルを置いていってスコアが高くなる配置を採用します。

ビームサーチで良い盤面を探す (1 sec ~ TIME_LIMIT)

銃の位置を固定したらビームサーチで盤面を探索していきます。

進め方としては以下の方法で行っていきます

盤面のシミュレーションを行う
いずれかのボールがまだパネルを配置していないマスに止まるとシミュレーションを止める
パネルを配置
シミュレーションを再開

パネルの配置

パネルの候補として /, \, . のいずれか 1つを選択します。配置候補が複数ある場合はすべての組み合わせを試しますが、空白パネルを 2つ以上を含む候補は除きます。

例) 配置候補が 2箇所ある場合のパネル候補一覧

/, /
/, \
/, .
\, /
\, \
\, .
., /
., \

盤面の評価

1つでもボールが盤面から出てしまうとゲームが終了するので、盤面から出ないように中央にボールが集まっている状態を良しとします。また、各パネルの配置についてボーナス点を与えていきます。いくつか例をあげます

盤面中央に向きを変える配置

f:id:simanman:20220120064446p:plain:w240

盤面中央へと進む配置

f:id:simanman:20220120064807p:plain:w240

また特定の配置にはペナルティを与えています

盤面外に向きを変える配置

f:id:simanman:20220120065117p:plain:w240

盤面外へと進む配置

f:id:simanman:20220120065225p:plain:w240

他にもボールの位置についてもペナルティやボーナスを与えています

f:id:simanman:20220120065532p:plain:w240

赤い部分はペナルティ、緑の部分はボーナスになっています。

他にも色々ありますが、基本的には「盤面中央に向かう変更ならボーナス、盤面外へと進む変更ならペナルティ」といった感じです。

感想

評価関数をうまく作れた人が勝てそうだと思ったので、ビジュアライザをずっと眺めてましたが最後までこれとこれといったアイデアは出てきませんでした。事前に計算した盤面を埋め込む方針も考えましたが、固定パネルとボーナスパネルがかなりノイズになっててそれが出来ない仕組みになっていた気がします。

Seed 1: 198
Seed 2: 42985
Seed 3: 28195
Seed 4: 2748
Seed 5: 285
Seed 6: 1866
Seed 7: 4445
Seed 8: 28274
Seed 9: 46363
Seed 10: 35872

2021-12-24

MM131 StopTheElves 感想

競技プログラミング

問題

www.topcoder.com

最近オンライン問題多くなってきた。

方針

プレゼントを囲むように箱を設置する
箱はすぐ設置するのではなくエルフが近づいたときに設置を行う
事前にシミュレーションを行って防衛成功率の低いプレゼントは守らない

プレゼントを囲むように箱を設置

プレゼントの周辺マスをチェックして、このマスに対して凸包を求めます。この操作は毎ターン行います。

f:id:simanman:20211223235843p:plain:w240

f:id:simanman:20211224000523p:plain:w240

f:id:simanman:20211224000842p:plain:w240

求めたマスに対して、それぞれ最短路を求めてマス同士を接続します。

f:id:simanman:20211224001108p:plain:w240

箱はすぐ設置するのではなくエルフが近づいたときに設置を行う

箱を早く設置しようがしまいが結局エルフは箱 or プレゼントを持ち帰るまでフィールドに居続けるので、なるべく右往左往してもらうために箱の設置はギリギリまで待ちます。あと、お金は基本的に保持していたほうがあらゆる事象に対処出来るのでなるべくセーブします。

f:id:simanman:20211224002300p:plain:w240

f:id:simanman:20211224002316p:plain:w240

事前にシミュレーションを行って防衛成功率の低いプレゼントは守らない

実際にクエリを送る前にシミュレーションをある程度回してみて、防衛に成功する確率の低いプレゼントについては守る対象から外します。

感想

エルフ行動の優先順位をもっとうまく利用した守り方もありそうだったけど、そこまで考察 & 実装が出来なかった。ただプレゼントを守るだけではダメなのはスコア差から読み取れたけど、そこから何も進めれなかったので残念。凸包以上に効率的な守り方を見つけられなかったのもダメだった。（木が多いときはもっと効率的な守り方がありそう）

2021-11-19

TCO21Final 参加記

競技プログラミング

TCO21Final に参加してきました。

問題

下の画像のようなトーラス状の N * N フィールドが存在しており、コインを 1つ取ると 100 point、棘を踏むとスコアが半分になります。コインは取得すると消えますが、棘は踏んでも消えません。

f:id:simanman:20211119225829p:plain:w400

プレイヤーは自由に移動できるというわけではなく移動用のサイコロを複数保持しています。(2 ~ 6 個）サイコロの種類は全部で 12種類存在しています。

0 が出るサイコロ (STAY)
1 が出るサイコロ (ONE)
2 が出るサイコロ (TWO)
3 が出るサイコロ (THREE)
4 が出るサイコロ (FOUR)
5 が出るサイコロ (FIVE)
6 が出るサイコロ (SIX)
1 or 3 or 5 が出るサイコロ (ODD)
2 or 4 or 6 が出るサイコロ (EVEN)
1 or 2 or 3 が出るサイコロ (LOW)
4 or 5 or 6 が出るサイコロ (HIGH)
1 or 2 or 3 or 4 or 5 or 6 が出るサイコロ (RANDOM)

1 度サイコロを使用すると、そのサイコロは消費され新たなサイコロを補充します。このとき補充されるサイコロの確率は各サイコロの出目の種類数に比例します。（なので、例えば RANDOM は ONE のサイコロよりも 6倍出やすく、ODD よりも 2倍出やすいといった感じです）

N * N ターンの間でより多くのスコアを取得するように頑張りましょうという問題です。（N * N ターンの経過を待たずとも途中で辞めることも可能です）

方針

最初は効率的にコインを集める方法で最初考えていたのですが、棘が密集している画像を見て「いかに棘を踏まないようにサイコロを振るか」に方針を変えました。

f:id:simanman:20211119232321p:plain:w400

基本方針は貪欲で現時点で保持しているサイコロを振ってみて止まる可能性のあるマスをすべて試して、その後の盤面の評価値の合計が高いものを採用していきます。

評価値としては主に以下のような内容になってます（一部）

コインに止まる [+100]
棘に止まる [-score / 2]
RANDOM サイコロを消費する [+1000]
RANDOM サイコロを安全に消費できるマスに移動する [+100]
棘以外のマスで STAY サイコロを振る [+1000000]

ある程度ゲームを回してみたところ、手持ちが RANDOM サイコロだけになり棘を踏むリスクが高まってゲームを中止するといった流れが多く見られたので、とにかく RANDOM を捨てることに比重を置きました。STAY サイコロは実質手札交換みたいなものなので振れるタイミングでは必ず振るようにしています。

あとは、実際にジャッジ側に投げる前にどの程度のスコアが取れるのかを事前にシミュレーションをしておき、そのシミュレーションの平均スコアを基準にゲームをやめるかどうかの判断を行いました。

やめる基準としては 事前計算した平均スコア * 0.8 のスコアを保持している状態で棘を踏む可能性が発生した際に、ローカルでシミューレーションを複数回してそのスコアの平均値が今よりも大きいのであればゲームを続行、そうでないなら終了という判断をしています。

感想

実装した内容は 2手読みな感じになったが、個人的にはモンテカルロ木探索とか刺さりそうな感じはした。（ただ眠くて頭が回らず実装できなかった） 3手読みは時間制限的に無理で、いい感じに DP で書けるかな？と思ったけどあまりいい感じにスコアは伸びてくれなかった。

関係ない話

ノート PC 1台で参加したけど、画面共有とカメラ ON でかなりのパワーを取られて少しモッサリした環境で参加する必要があったのが辛み。

事前に AWS マシンの提供があったけど結局使ってない、割とハイスペックだったからテスト回すときとか使っとけばよかった感はある。

途中眠すぎて寝てた。

トイレ行きたいときに毎回チャットで連絡して返信を待つのが面倒だった。

途中 AHC006 が開催れていることに気づいて順位表だけ眺めてた。

たまに英語でなにか言われたけど、何を言ってるかわかってなくてとりあえず「OK!!」とか言ってた。間違った返事をしている可能性がある。（大体はテキストでやり取りしてたので多分大丈夫）

色々新鮮で楽しかった。

2021-11-03

Marathon Match 130 GraphLabeling Strategy

Problem

www.topcoder.com

Note

This article has been translated by DeepL. Original ver is here.

https://simanman.hatenablog.com/entry/2021/11/03/121958

Strategy

The general strategy is to determine the maximum value for labeling before, and then gradually reduce the maximum value if the entire node was successfully labeled, in a binary search-like strategy.

Pre-determine the maximum value of label X to be given before labeling.
Start labeling from a random point and visit each node recursively
When labeling at each vertex, check if nodes with an even distance from the starting point can be labeled from 0 (0, 1, 2, ..., X). Nodes with odd distances are checked to see if they can be labeled from X (X, X - 1, X - 2, ..., 0).
If labeling succeeds for all nodes, update the maximum X value (X = X / 2); if it fails, set a value slightly lower than the previous best value (X = BEST_X * random_value [0.95, 1.0))
Repeat this procedure until the time limit is reached.

Determine the maximum value X of the label

Determine the maximum value when labeling is done. This way you will get a graph with a definite X score when the labeling is complete.

Labeling

Start from a random point and label it. Nodes with an even distance from the starting point will search for labels assignable from 0, and nodes with an odd distance will search for labels assignable from X.

Example for X = 100

f:id:simanman:20211103120144p:plain:w240

f:id:simanman:20211103120719p:plain:w240

f:id:simanman:20211103120707p:plain:w240

f:id:simanman:20211103120826p:plain:w240

f:id:simanman:20211103120925p:plain:w240

f:id:simanman:20211103121016p:plain:w240

Since the labeling was successful for X = 100, we will do the same for the next attempt with the value of X = 50. If it fails, we randomly choose a value from the range X = [95, 99]. If it succeeds, it will lower the maximum value by a large amount, and if it fails, it will lower it by a small amount.

Repeat this process until the time limit and output the best labeling.

Summary

My overall score was low because I didn't know about the Gorom ruler, but I was able to get a high score on a sparse graph, so I put together a strategy.

2021-11-03

Marathon Match 130 GraphLabeling 戦略

競技プログラミング

問題

www.topcoder.com

戦略

大まかな戦略としてはラベル付けを行う際の最大値を事前に決定して、ノード全体にラベル付けが成功した場合は最大値を徐々に減らしていくという二分探索風な戦略を取りました。

ラベル付けを行う前に与えるラベルの最大値 X を事前に決定する
任意の点からラベル付けを開始し、各ノードへ再帰的に訪問する
各頂点においてラベル付けを行う際に、開始地点からの距離が偶数のノードは 0 からラベル付けが行えるかどうかをチェックする (0, 1, 2 ... X)。距離が奇数のノードは X からラベル付けが行えるかどうかをチェックする (X, X - 1, X - 2, ... 0)。
すべてのノードに対してラベル付けが成功した場合は最大値 X の値を更新する（X = X / 2)、失敗した場合は過去のベストの値から少し下げた値を設定する（X = BEST_X * random_value [0.95, 1.0))
この手順を制限時間まで繰り返す

ラベルの最大値 X を決定する

ラベル付けを行う際に最大値を決定します。こうすることでラベル付けが完了した際に X スコアが確定したグラフを手に入れることが出来ます。

ラベル付けを行う

任意の点から開始してラベル付けを行います。開始地点から距離が偶数のノードは 0 から割り振り可能なラベルを探索、距離が奇数のノードは X から割り振り可能なラベルの探索を行います。

X = 100 の例

f:id:simanman:20211103120144p:plain:w240

f:id:simanman:20211103120719p:plain:w240

f:id:simanman:20211103120707p:plain:w240

f:id:simanman:20211103120826p:plain:w240

f:id:simanman:20211103120925p:plain:w240

f:id:simanman:20211103121016p:plain:w240

X = 100 ではラベル付けが成功したので、次の試行では X = 50 の値で同じ処理を行います。もし失敗した場合は X = [95, 99] の範囲からランダムに選びます。成功したときは大きく最大値を下げて、失敗した場合は小さく下げていく感じになります。

この処理を制限時間まで繰り返して一番良かったラベル付けを出力します。

まとめ

ゴロム定規を知らなかったので全体的なスコアは低かったのですが疎なグラフでは高いスコアを出せていたので方針をまとめてみました。

2021-09-24

Marathon Match 129 ChickenRun (translated ver)

Prolog

This article is translated by DeepL and fix a little.

simanman.hatenablog.com

Problem

www.topcoder.com

Strategy

Organize the farmers into groups of two. If there is an odd number, only one group will have three members.
Move the farmer.
Evaluate the grid after the move to determine the best farmers placement.

Farmer groupings

We will need at least two people to catch the chickens, so build a group of two.（There are some patterns that can be caught by one person, but that's not many.）

f:id:simanman:20210923220118p:plain:w480

f:id:simanman:20210923220132p:plain:w480

If there is an odd number, there will be only one group of three. Each group decides on a chicken to chase, and then tries to chase.

Move the farmers

As for moving farmers, there are five possible move candidates for each farmer: up, down, right, left, and wait. If you try to combine the moves of each farmer, you will get a pattern of 5 ** number of farmers. In order to find the optimal placement from this vast of patterns, I used the direction of movement of the farmers as an array and SA (Simulated Annealing) the direction of movement to find the optimal placement.

f:id:simanman:20210924055628p:plain:w480

The transition is select any one point and change direction.

f:id:simanman:20210924055532p:plain:w480

This will make it easier to search for some effective patterns even if the number of farmers increases.

Evaluation of the grid

For the evaluation of the board, we focused on the following items for confirmation.

Catching the chickens
Check to see if the chickens are in a caught pattern.
Whether you are able to get close to the chicken to be captured or not.
Whether or not the candidates of movement of the chickens to be captured is being reduced.

The higher the likelihood of catching chickens, the higher the evaluation value.

Catching the chickens

If farmer reaches the same cell as the chicken that is the highest value. In some cases, it may be better not to dare to catch them, but I couldn't evaluate them that much.

A case where it's better not to catch them.

f:id:simanman:20210923222127p:plain:w480

The best solution here will be to try to catch two chickens at the same time.

f:id:simanman:20210923222231p:plain:w480

Patterns that can be caught

Condition of a chicken can be caught is the chicken cannot move and there is a farmer in the adjacent cell to the chicken. For each chicken, check if it is in this state, and if it matches the pattern, give it a higher value. The main patterns are as follows.

f:id:simanman:20210923223028p:plain:w600

Leading to a catching pattern

For the patterns that can be caught, the prior state is only one candidate for chicken movement and a fence or farmer exists at the destination, so this state is also evaluated.

f:id:simanman:20210923224255p:plain:w600

Patterns that can be caught with probability

Since both the order and direction of the chickens' actions are randomly determined, the order and direction of the chickens' movements may lead to a catching pattern, also evaluate those patterns.

A pattern that can be caught depending on the order in which the chickens move.

f:id:simanman:20210923231041p:plain:w600

A pattern that can be caught depending on where the chickens are moved.

f:id:simanman:20210923231350p:plain:w600

Approach the target chicken.

Approach the target chicken. When approaching, move in such a way that you can reach as many cells as possible before the target chicken.（Move the chicken so that its Voronoi area becomes smaller.）

Summary

I made a mistake in the design of the evaluate function because the content of the evaluation function to catch the chickens became complicated and produced some unintended behavior. As for the movement of farmers, I think it would have been more efficient to focus on a specific area rather than the whole area at once, but I didn't have time to implement it, so I gave up. I was a little put off by the randomness of the chicken movement, but I wish I had done a little more detailed research and incorporated it into my evaluate function.

2021-09-24

Marathon Match 129 ChickenRun

競技プログラミング

問題

www.topcoder.com

方針

農家を 2人 1グループにまとめる。奇数の場合は 1つだけ3人のグループになる。
農家を移動させる
移動後の盤面を評価して一番良い配置を決定する

農家のグループ分け

鶏の確保には最低でも 2人必要なので 2人組のグループを作成します。（1人でも確保出来るパターンは存在しますがそんなに多くありません）

f:id:simanman:20210923220118p:plain:w480

f:id:simanman:20210923220132p:plain:w480

奇数の場合は 1つだけ 3人のグループが出来ます。各グループごとに追いかける鶏を決定し、その鶏を追いかけるようにします。

農家の移動

農家の移動ですが、各農家の移動先の候補は 上, 下, 右, 左, 待機 の 5つの候補が存在しており、各農家の移動の組み合わせを試そうとした場合 5 ** 農家の人数 のパターンが発生します。この膨大なパターンの中から最適な配置を見つけるために農家の移動方向を配列としてその移動方向を焼き鈍すことで最適な配置を求めました。

f:id:simanman:20210923213836p:plain:w480

遷移は任意の1箇所を選んで方向を変更するです

f:id:simanman:20210923213940p:plain:w480

これで農家が増えてもある程度有効なパターンを探索しやすくなります。

盤面の評価

盤面の評価は以下の項目を中心に確認を行いました。

鶏の確保
鶏が確保できるパターンかどうかの確認
捕獲対象の鶏に接近できているかどうか
捕獲対象の鶏の移動可能範囲を狭めているかどうか

鶏の確保に繋がる可能性が高いほどより高い評価になるようにしています。

鶏の確保

鶏と同じマスに到達した場合はスコアが増えるので評価としては一番高いです。場合によってはあえて捕まえないほうが良くなる場合もありますが、そこまで評価出来ませんでした。

捕まえないほうが良くなるケース

f:id:simanman:20210923222127p:plain:w480

二羽同時に捕まえられるようにするのがここでは最適解となります。

f:id:simanman:20210923222231p:plain:w480

確保できるパターン

鶏が確保できる状態というのは 鶏が移動不可 + 鶏の隣接マスに農家が存在する ことを指します。各鶏についてこの状態になっているかどうかを調べ、そのパターンに一致した場合は評価を高くします。主なパターンとしては以下のようなものがあります。

f:id:simanman:20210923223028p:plain:w600

確定パターンへの誘導

確保出来るパターンについては事前の状態として 鶏の移動候補が 1つ + 移動先に柵または農家が存在している という状態になっているので、この状態についても評価を行います。

f:id:simanman:20210923224255p:plain:w600

確率で確保出来るパターン

鶏の行動は行動する順番も方向もランダムに決定されるので、鶏の移動順や移動方向によっては確定パターンにつながる可能性があります。それらのパターンについても評価を行います。

鶏の移動順によっては確保が可能になるパターン。

f:id:simanman:20210923231041p:plain:w600

鶏の移動先によっては確保が可能になるパターン。

f:id:simanman:20210923231350p:plain:w600

対象の鶏に接近する

対象の鶏に対して接近します。接近する際には対象の鶏より先に到達出来るマスの数を多くするように移動します。（鶏のボロノイ領域が小さくなるように移動する）

まとめ

鶏を確保するための評価関数の中身が複雑になって意図しない挙動も出てしまったので評価関数の設計で失敗してしまった感じ。農家の移動に関しては全体をまとめてやるより特定の領域とかで絞ったほうがもっと効率的に配置を調べられた気もするけど実装する時間がなかったので諦めた。鶏の移動についてランダム性が高いのでちょっと後回しにしてたけど、もうちょっと細かく調査して評価に組み込めば良かったなと思う。