Immigration officials have been told to vet refugees’ social media posts using Google’s online translator. Language experts caution even students against using the service.
It’s a common internet experience: throw a foreign phrase into Google Translate or any other online translation tool and out comes a farcical approximation of the real thing.
That’s why many experts, even Google itself, caution against relying on the popular Google Translate for complex tasks. Google advises users that its machine translation service is not “intended to replace human translators.”
Yet the U.S. government has decided that Google Translate and other machine translation tools are appropriate for one task: helping to decide whether refugees should be allowed into the United States.
An internal manual produced by U.S. Citizenship and Immigration Services, the federal agency charged with admitting immigrants, instructs officers who sift through non-English social media posts of refugees that “the most efficient approach to translate foreign language contents is to utilize one of the many free online language translation services provided by Google, Yahoo, Bing, and other search engines.” The manual includes step-by-step instructions for Google Translate.
The manual was obtained by the International Refugee Assistance Project through a public records request and shared with ProPublica.
Language experts said the government’s reliance on automatic translation to dig into refugee social media posts was troubling and likely to be error-filled since the services are not designed to parse nuance or recognize slang. The government may misconstrue harmless comments or miss an actually threatening one.
“It’s naive on the part of government officials to do that,” said Douglas Hofstadter, a professor of cognitive science and comparative literature at Indiana University at Bloomington, who has studied language and analogies. “I find it deeply disheartening and stupid and shortsighted, personally.”
Asked about the agency’s use of machine translation tools, USCIS spokeswoman Jessica Collins said in an emailed statement that review of publicly available social media information “is a common sense measure to strengthen our vetting procedures.”
USCIS has stated that “information collected from social media, by itself, will not be a basis to deny refugee resettlement.”
In 2017, Facebook apologized after its machine-translation service translated a post by a Palestinian man that said “good morning” as “hurt them” in English or “attack them” in Hebrew.
As a test, ProPublica asked language professors to copy and paste tweets written in casual language into Google Translate and compare the results with how they would interpret the tweets.
One recent Urdu-language post on Twitter included a sentence that Mustafa Menai, who teaches Urdu at the University of Pennsylvania, translated as “I have been spanked a lot and have also gathered a lot of love (from my parents).”
Google translated the sentence as “The beating is too big and the love is too windy.”
The Trump administration has vastly expanded the role of social media in deciding whether people can move or travel to the United States. Refugee advocates say the government’s reliance on machine-translation tools raises further concerns about how immigration officers make important decisions affecting applicants’ lives and U.S. national security.
USCIS has itself found that automated translation falls short in understanding social media posts. An undated draft internal review of a USCIS pilot social media vetting program concluded that “automatic foreign language translation was not sufficient.”
A separate pilot review conducted in June 2016 stated that “native Arabic language and subject matter expertise in regional culture, religion, and terrorism was needed to fully vet” two cases in which potentially derogatory social media information was found. The documents were published by the Daily Beast in January 2018.
The manual, much of which is redacted, only addresses procedures for a narrow subset of refugees: people whose spouses or parents have already been granted refugee status in the U.S., or so-called follow-to-join cases. In 2017, 1,679 follow-to-join refugees were admitted to the U.S., about 3% of total refugee admissions, according to government data.
“It defies logic that we would use unreliable tools to decide whether refugees can reunite with their families,” said Betsy Fisher, strategy director at IRAP. “We wouldn’t use Google Translate for our homework, but we are using it to keep refugee families separated.”
In a federal lawsuit in Washington state that is now in the discovery phase, IRAP is challenging the Trump administration’s suspension of the follow-to-join refugee program.
It is unclear how widely the manual’s procedures are used throughout USCIS, or if its procedures are identical to those used for vetting all refugees or other types of immigrants.
The manual is undated, but it was released to IRAP in response to a request for records created on or after Oct. 23, 2017.
USCIS did not respond to questions on whether the manual’s procedures are used to vet other refugees, when it was put into use or if it is still in use.
“The mission of USCIS first and foremost is to safeguard our homeland and the people in it,” Collins said. “Our first line of defense in these efforts is thorough, systematic vetting.”
In the 2018 fiscal year, USCIS conducted 11,740 social media screenings, according to an agency presentation.
The USCIS manual acknowledges that “occasionally,” online translation services may not be adequate for understanding “foreign text written in a dialect or colloquial usage,” but it leaves it up to individual officers to decide whether to request expert translation services.
Without foreign language fluency, an officer is unlikely to know whether a post needs additional review, said Rachel Levinson-Waldman, senior counsel at the Brennan Center for Justice.
Google and Verizon, which owns Yahoo, did not respond to questions about the use of their services when vetting refugees. Emily Chounlamany, a Microsoft spokeswoman, said “the company has nothing to share on the matter.”
Language experts say satire is another problematic area. A recent satirical Persian-language tweet showed a picture of Iranian elites raising their hands, with text stating, “Whose child lives in America?” (The tweet is commentary on a recent controversy in Iran regarding high-ranking officials’ close relatives living in the West.) The text was translated by Google as “When will you taste America?” Microsoft’s result was: “Who is the American?”
“The thing about Persian and the Iranian culture is that people love to make jokes about anything,” said Sheida Dayani, who teaches Persian at Harvard University and instructs her students to avoid using Google Translate or similar tools for their assignments. “How are you going to translate it via Google Translate?”
Automated translation services are the “absolute wrong technology” for immigration officers making important decisions, Dayani said.
The use of translation tools has come up in other contexts. After a highway patrol trooper in Kansas conducted a warrantless search of a Mexican man’s car in 2017 by asking the man for consent to do so in Spanish via Google Translate, a U.S. district judge threw out the search evidence, finding that the defendant did not fully understand the officer’s commands and questions.
Google has touted improvements in its translation tool in recent years, most notably its use of “neural machine translation,” which it has gradually rolled out for more languages. Researchers in the Netherlands have found that while the neural machine translation method improves quality, it still struggles to accurately translate idioms.
One major problem with machine translation is that such tools do not understand text in the same way that a person would, Hofstadter said. Rather, they are engaged in “decoding” or “text substitution,” he said.
“When it involves anything that is subtle, you can never rely on it because you can never know if it’s going to make grotesque errors,” Hofstadter said.
Machine-translation services are typically trained by using texts that have already been translated, which tend to use more formal speech, for instance official United Nations documents, said David Guy Brizan, a professor at the University of San Francisco who researches natural language processing and machine learning.
Language iterates too quickly, especially among young people, for even sophisticated machine-translation services to keep up, Brizan said. He pointed to examples of English-language phrases currently popular on social media such as “low-key” or “being canceled” as ones that automated services could struggle to convey.
He added that nontextual context like videos and pictures, the parties involved in a conversation and their relationship, and cultural references would be completely lost on machine translation.
“It requires a cultural literacy across languages, across generations, that is sort of impossible to keep up with,” he said. “You can think of these translation programs acting as your parents or grandparents.”
Rachel Thomas, director of the Center for Applied Data Ethics at the University of San Francisco, said that while machine-translation capabilities are improving, anyone depending on algorithms or computers should think carefully about the recourse for people wronged by those systems’ mistakes.
Refugees rejected for admission can request a decision review, but advocates say they are typically given little detail as to why they were rejected.
Efforts to scrutinize social media posts of some people trying to enter the United States began under the Obama administration, and they were encouraged by Democrats and Republicans in Congress. USCIS launched a social media division within its Fraud Detection and National Security division in July 2016, building on pilot programs operating since 2015.
The Trump administration has dramatically increased social media collection as part of a push for “extreme vetting” of people entering the country. In May, the State Department updated its visa forms to request social media identifiers from most U.S. visa applicants worldwide.
In September, the Department of Homeland Security published a notice stating it intended to request social media information from a broad swath of applicants, including people seeking U.S. citizenship or permanent residence, refugees and asylees.
Jeff Kao contributed to this story.