Frequently Asked Questions
I will add more technical FAQs soon.
Question: Any future timeline to share on this initiative?
Answer: The longer, in-depth videos of Parts 1, 3.1, and 3.2 are already available on our website and YouTube. We aim to release similar in-depth videos for Parts 2.1 + 2.2 in September 2024, and Part 3.3 later in 2024. Regarding Part 4 and beyond, we are exploring several interesting directions.
Question: Any plans on code/data release?
Answer: We strongly believe in the importance of code and data sharing. However, as a small team with multiple priorities, we need to manage our time carefully. Refactoring code and obtaining legal approval take time, and we have limited people: only 1 programmer for Parts 1 and 3, and 0.5 programmer for Part 2 (Tian Ye is at Meta only 6 months per year; Zicheng was laid off by Meta).
For Part 1, the data (CFG trees) are included in the PDF, and generating from the CFGs requires only simple code, which we are not providing due to time constraints.
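For readers who want a starting point, generating a string from a CFG can be sketched in a few lines of Python. This is not our code: the grammar `TOY_CFG`, the function name `sample_from_cfg`, and the uniform choice among rules are all assumptions made for illustration, and the actual CFG trees should be taken from the PDF.

```python
import random

# Hypothetical toy grammar (NOT one of the CFGs in the paper).
# Each nonterminal maps to a list of alternative right-hand sides;
# any symbol without an entry is treated as a terminal token.
TOY_CFG = {
    "S": [["A", "B"], ["B", "A", "B"]],
    "A": [["1", "2"], ["B", "3"]],
    "B": [["3", "1"], ["2"]],
}

def sample_from_cfg(grammar, symbol, rng):
    """Expand `symbol` recursively, picking one rule uniformly at random."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit it as a single token
    rule = rng.choice(grammar[symbol])
    tokens = []
    for child in rule:
        tokens.extend(sample_from_cfg(grammar, child, rng))
    return tokens

rng = random.Random(0)  # fixed seed so a run is reproducible
print(" ".join(sample_from_cfg(TOY_CFG, "S", rng)))
```

The recursion terminates because the toy grammar has no cycles; a grammar with recursive rules would need a depth limit.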
For Part 3, we have detailed how to generate the data, which involves simple random generation of names and employers. We cannot release the bioR data as it is difficult to human-verify all Llama outputs. However, we plan to release the bioS data and the prompts to generate the bioR data, though this will take some time.
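As a rough illustration of what "simple random generation of names and employers" can look like, here is a minimal Python sketch. It is not the bioS generator: the name and employer pools, the helper names `make_profile` and `render_bio`, and the sentence template are all hypothetical placeholders, and the paper describes the actual procedure.

```python
import random

# Hypothetical pools, for illustration only.
FIRST_NAMES = ["Alice", "Brian", "Carla"]
LAST_NAMES = ["Nguyen", "Okafor", "Petrov"]
EMPLOYERS = ["Acme Corp", "Globex", "Initech"]

def make_profile(rng):
    """Draw one synthetic individual with a random name and employer."""
    return {
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "employer": rng.choice(EMPLOYERS),
    }

def render_bio(profile):
    """Fill a fixed (assumed) sentence template from the profile."""
    return f"{profile['name']} worked at {profile['employer']}."

rng = random.Random(0)  # fixed seed so a run is reproducible
for _ in range(3):
    print(render_bio(make_profile(rng)))
```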
For Part 2, we will release the code for generating the iGSM data after polishing it and obtaining legal approval. In the meantime, we have provided all necessary pseudocode in the released PDF paper to help readers understand the data generation process.
Question: Any collaboration possibilities?
Answer: At FAIR, we encourage collaborations with external researchers, especially students and professors from academia. However, due to company policy, we cannot share code or GPU resources. This means that unless you have your own GPU resources, we can only discuss ideas at a high level and cannot conduct experiments together. Currently, my only intern position is committed to Tian Ye, so no additional headcount is available. The only exception is for UW students, who can participate in a 2-year co-mentorship program with me. Unfortunately, no UW students showed interest in this project this year, so that headcount is no longer available. For more opportunities, my manager Lin Xiao oversees the CoreML group at Meta, and you can find the application details here: https://www.metacareers.com/jobs/613693480929044/.