Neural Decipherment of Lost Languages
Python, PyTorch, Min-Cost Flow, Seq2seq Model, CNN, BiLSTM
Built a system to decipher archaic languages using symbols of known language. The proposed model is able to identify 81% of cognates correctly between Ugaritic (lost) and Hebrew (known) languages, generating results with a 15.13% improvement. Devised Probabilistic Weight Initialization, stacked BiLSTM seq2seq architecture, Residual Connections, and minimum-cost flow algorithm to get improved results on 4 different language pairs.