Sharing of Department Summer Internship 2019

Census and Statistics Department
CHENG, Wing, BSc in Statistics

In this internship programme, I was assigned to the Trade Statistics Branch (2), Electronic Trading and Manifests Services Section. This section mainly collects and analyses shipping data. Over the two months of the internship, I was responsible for three main tasks. First, I researched and implemented a text tokenisation algorithm used in Natural Language Processing, namely byte pair encoding. Second, I translated R code for text analytics (such as regular expressions and the tokenisation algorithm) into Python. Third, I performed quality assurance duties by testing a Python command-line program for text classification.

I struggled when I first encountered these tasks because I felt that I lacked the knowledge to complete them. Fortunately, my supervisor Sam was kind and patient and guided me onto the right track. My coding skills greatly improved during the internship period. I was also exposed to a lot of deep learning code for Natural Language Processing and gained an idea of how such processing is performed, which increased my interest in this field.

I would also like to thank Professor Yau, who was kind and gave me some valuable advice on ways of learning. My task from him was to modify some LaTeX notes with a wide range of statistics material.

It was my pleasure to be a part of this internship programme. I would like to thank Sam, my supervisor in C&SD, and my supervisor Professor Yau, both of whom gave me many tips about learning and determining my plans for the future. After this internship programme, I have a clearer insight into what I want to do and what skills I need to work on to prepare for the future.

Census and Statistics Department
GUO, Erya, BSc in Statistics

During the internship period, I was assigned to the Science and Technology Section. This section is mainly responsible for collecting statistics that reflect public technology usage in Hong Kong and the status of innovation, which are essential factors in defining Hong Kong as a knowledge-based economy.

My first duty was assisting with the preparation of the latest issue of “Hong Kong as a Knowledge-based Economy”. I presented different types of datasets using various means of data visualisation to clarify the trends and comparisons. I also learned how to compile a statistical publication to make it both objective and comprehensible. The second task was more demanding, as I was asked to review the Inland Revenue Ordinance and reconcile research and development (R&D) statistics. In this process, I learned to use Access VBA on large datasets, which was entirely new to me. With the assistance of my supervisor and other colleagues, I finally completed over 5000 revised datasets. Through these real-world tasks, I not only consolidated the knowledge that I had mastered at university, such as R programming and survey design, but also came to understand the entire survey process of the S&T section.

I also acquired a lot of knowledge from my supervisor at school, Professor WEI Yingying. Her current research field is image recognition and artificial intelligence (AI). This was the first time that I had stepped into the machine learning and AI area. I read through all of the dissertations she recommended and compiled four reports on these topics. I also optimised the web crawler for picture extraction from Google. That was a real challenge for me, but with my supervisor’s instruction, I was put on the right track and eventually came up with the working code. This broadened my horizons and helped me to learn more about the mechanisms of machine learning.

The programme was a very nice change from the structure of a regular course. From my perspective, there was a fair amount of work, to which I was able to apply the knowledge learned from books. My supervisors and colleagues were patient and helpful, always willing to provide suggestions when I encountered obstacles. I built deep friendships with them over the course of the internship. To sum up, the Professional Attachment internship programme is definitely a valuable opportunity for students who would like to pursue further study in Statistics and related fields.

Census and Statistics Department
HUNG, Fan Hin, BSc in Statistics

I am grateful to have had the chance to work in the Statistical Processing Systems Branch (SPSB) this summer. This section mainly develops software and IT systems for the Census and Statistics Department. I was responsible for three tasks: to construct an automatic address lookup programme using R, to build a user interface for R and to present the programme to users who seldom or never use R.

The first task was writing code to match some survey address inputs with records in a database. I did not encounter many problems with this task. Because some users are not familiar with R, I was asked to build a user interface for the R programme, which has a command-line interface. This was important because SPSB makes user-friendliness its top priority in all of the software development work that it does for C&SD. This second task presented the greatest difficulties for me. Fortunately, my supervisor was patient and gave me a lot of advice. He researched the problem extensively and suggested several libraries for me to try. Using these libraries, I successfully completed the task and presented the programme to colleagues.

The internship at C&SD gave me valuable work experience in the field of statistics. Through the internship, I was able to get a glimpse into the section’s working routine and techniques, and I had the chance to apply the knowledge that I had learned in college. The experience of constructing and presenting an entire project is very different from that of learning in college, but my supervisor and colleagues were patient and gave me valuable suggestions when I ran into difficulties. The Professional Attachment internship is definitely a valuable experience for students.

Census and Statistics Department
KUOK, Chio Ieng, BSc in Statistics

I am grateful to have been given the precious opportunity to work as an intern in the Census and Statistics Department. I was assigned to the Trade Analytics Section, which is mainly responsible for computing statistics for the merchandise trade. These figures are important indicators of the development of trade in Hong Kong.

My supervisor Ms Carly Yuk Ling Lai was very nice and introduced me to the project she was currently working on. The main task of the programme was to apply deep learning techniques to assigning commodity codes on the basis of commodity descriptions. My exposure to this project was valuable because running a neural network places stringent requirements on the size of the dataset and the amount of computational power. We were also assigned some desk research that enabled us to acquire knowledge about the conditions of merchandise trade in Hong Kong and how these statistics are computed.

This work experience gave me a taste of what the work of a real statistician is like, especially because the projects undertaken at C&SD are demanding and call for the application of statistical theories in real-world projects. This also taught me about the difficulties that can be encountered when handling data in large-scale projects. I would like to express my gratitude to both of my supervisors, Ms Lai and Professor Philip Yam, who provided much reading material for me to familiarise myself with neural networks for my work in C&SD. I believe that all of the things I learnt in the course of this internship will be useful in my future career.

Census and Statistics Department
WONG, Man Him, BSc in Statistics

I am grateful to have had the opportunity to work in the Census and Statistics Department over the past two months. I was assigned to the Trade Analysis Section (2) of Trade Statistics Branch (1). This section mainly deals with external merchandise trade statistics but also conducts some customer opinion surveys.

The internship was comprehensive and gave me many new experiences. Because I was assigned to the trading branch, I gained a lot of knowledge about the trading system, not only in Hong Kong but also in its trading partners. By learning how to complete the import/export declarations, classifying commodities by the Harmonized Commodity Description and Coding System (HS), and analysing discrepancies in trading records, I have had a taste of what it is like to be a statistician.

I was honoured that my supervisors allowed me to investigate their applied model of text analytics, which is used for transforming the commodity descriptions on the import/export declarations into numeric vectors. From this I learnt some practical R computing techniques and some machine learning concepts.

Finally, thanks to the one-day course held by C&SD, I gained insight into the whole structure of the organisation. During this course, the instructors gave us brief but clear introductions to almost all of the branches of C&SD, and to the potential career paths and working conditions inside the Department.

Overall, the internship was interesting and fruitful, and I am thankful to the Statistics Department and C&SD for providing me with a wonderful experience.

Centre for Clinical Research and Biostatistics
CUI, Tianye, BSc in Statistics

My first task at CCRB was to analyse data from a survey and write a report on it. After cleaning the data, I explored the meaning behind it using some statistical techniques. With some excellent assistance from the CCRB staff and some of my own study, I learned how to write a survey report.

The second task assigned to me was to analyse a more complicated dataset. At first, I felt stuck and spent a lot of time fumbling around. When I turned to Professor Zee for help, he taught me that I needed to look deep into the dataset and focus on understanding the data themselves rather than just thinking about using the knowledge I had learned at school. This was enlightening, as I realised that I did not even know anything about the data collection, which was an important part of the analysis. It was exciting to see the seemingly meaningless numbers becoming meaningful once they had been properly organised and interpreted.

During the internship, I also had the chance to learn about statistical consulting for medical research. This equipped me with the principles for designing medical research and deepened my understanding of the statistical methods I had learned at school.

Another special experience was helping to introduce a product at the Hong Kong International Medical and Healthcare Fair. There were many technological products designed for the elderly or for disabled people, and it was inspiring to see how medical developments can serve to improve people’s quality of life.

It was a great pleasure to work at CCRB with Professor Zee and all of the kind staff there. Professor Zee not only taught me how to be a good statistician but also shed some light on my future career. The staff at CCRB were very nice and always willing to help. I greatly appreciate the opportunity our department provided for me to take up this internship because it will definitely prove helpful for my future.

Centre for Clinical Research and Biostatistics
CONG, Qing, BSc in Statistics

During my internship, a study was conducted in which subjects had their retinal images taken and their health information collected to examine their lifestyles and analyse their risk of stroke. The main task I was assigned was to help in collecting first-hand data from these retinal images. To complete the task, I was asked to use an image-processing application called ImageJ. It was my first time working on a real project, and I did not do it well at first. However, Professor Zee kindly explained to me how to go about it. He also taught me the importance of communicating well with others, because knowing what your supervisor requires is important, and your colleagues may also provide useful suggestions and support.

In school, we seldom have a chance to collect data ourselves, as the professor usually gives us a dataset on which to do the analysis. After going through this process, I now understand that data collection is an important part of analysis because the quality of the data we obtain affects the result.

I am grateful for the valuable opportunity to work at the CCRB this summer. The people I met there were very nice and taught me many things that the textbooks cannot teach. I had a great experience in the CCRB’s summer programme.

New Media Group
MAO, Ruiqi, BSc in Risk Management Science

This summer, I was pleased to have the opportunity to work as an intern in New Media Group. I was placed in the IT department, where my primary duty was to work with the customer database, performing clean-up, reconstruction, modelling analysis and other tasks. To help the sales department make marketing decisions and identify potential clients, I ran analytical models on the existing databases and calculated some values for them to use as a reference. This process required the use of some statistical techniques, which connected what I had learned in college with business practice. Furthermore, to help the sales department understand the analysis outcomes, I produced various dashboards to visualise the data. At the end of the internship, I did some research into the advertising algorithms of Facebook and Instagram. The information was not easily accessible, but I learned a considerable amount of specialised knowledge, and I hope that it will contribute at least a little to the team in their future work.

I was very fortunate to work with such a warm team. Although I was the youngest member and my Cantonese was not good at the beginning, they were always patient and looked after me. I will never forget the social activities (a happy hour, a colleague’s birthday lunch and my farewell dinner), all of which helped me to see different possibilities in life through talking to my colleagues. I would especially like to thank my leader, who not only taught me professional skills for work, but also provided inspiration for my future career and life plans. It was a precious opportunity and an enlightening experience, and I greatly appreciate the efforts of the Department of Statistics and New Media Group in making the internship possible.

