FCS News

Training in Big Text Data

Posted by CS Magazine on November 3, 2014 in Research, Graduate, News, Big Data & Machine Learning

Originally shared in the Fall 2014 CS Magazine.

Big Text Data Analytics and Health Care

Lulu Huang started her undergraduate degree in China. After listening to a presentation by Dr. Phil Cox on joint degree opportunities, she decided to finish her degree at Dal. Although language presented a challenge and it took a little while to get caught up with classwork, the courses were very appealing to her. During the last year of her undergraduate degree, Lulu particularly enjoyed data mining and natural language processing, and completed a research project on text mining and visualization.

Originally, Lulu’s idea was to get a job after graduating. After strong encouragement and sound advice from Dr. Evangelos Milios and Dr. Stan Matwin, she became inspired to further her studies. When the opportunity arose to complete a master’s degree in big text data under the supervision of Dr. Matwin, she took it with enthusiasm and started her graduate studies in September. In collaboration with a major care provider company, Lulu currently works on analyzing data from geriatric care patients, with the goal of understanding how care can be better delivered. For example, how can a patient’s fall be prevented, and how can we determine who is most prone to falls in geriatric care?

Lulu has quickly gotten used to living in Canada. “Halifax is very similar to the city I come from in southeast China, except for the winters!” If a great opportunity came up after graduating, she would probably stay in Canada. She says she might have to look very closely, though, at her location choice. “I couldn't live anywhere that was colder than Halifax!”

A Collaborative NSERC CREATE Program

The Institute for Big Data Analytics is pleased to help train Canada’s next generation of highly skilled innovators through Training in Big Text Data (TRIBE).

TRIBE is a Natural Sciences and Engineering Research Council of Canada (NSERC) Collaborative Research and Training Experience (CREATE) program. A joint creation of Dalhousie University, Simon Fraser University, and Université de Montréal, it offers fully funded, competitive scholarships for Master of Computer Science and PhD in Computer Science students. Dr. Stan Matwin, Canada Research Chair in Visual Text Analytics and director of Canada’s only Institute for Big Data Analytics, received the funding to develop training programs that give young researchers opportunities to make the transition from trainees to productive employees in the Canadian workforce.

CREATE program provides resources

The CREATE program provides leading research teams in natural sciences and engineering with the resources to implement an applied training environment that combines research knowledge and experience with the personal and professional skills needed in industry or government workplaces. Dalhousie has been very successful in competing for CREATE grants, receiving one in almost every year a competition has been held.

Dr. Matwin and his research team will receive $1.65 million in CREATE funding over the next six years. The TRIBE program is based on five pillars: a structured curriculum combining the know-how, including “soft skills,” that industry demands; hands-on training in the form of an industrial internship; national and international student mobility and exchange; a multilingual application and training environment; and respect for privacy as a value instilled in students and in the approach to big data. The combination of these pillars will provide unique, industrially relevant graduate training in an area of unmet high demand in Canada and globally. Students who fulfill all the course and internship requirements will obtain a Graduate Certificate in Big Text Data.

Industry-relevant training

“This training will be achieved by students participating in focused, industrially-relevant research projects,” says Dr. Matwin. “In TRIBE, all students will do an industry internship, and most of them will spend some time doing research in one of our partner universities.”

The amount of data managed by organizations in North America and around the world has exploded in recent years. Dr. Matwin says analyzing large data sets will become a key priority in many different domains. A recent McKinsey report predicts a shortage of 140,000 data analysts in the US alone in the next four years. “TRIBE will provide advanced training in areas of high demand, producing graduates that will be in high demand,” says Dr. Matwin.

The CREATE grant will also help the Institute for Big Data Analytics attract top-notch students and provide the Institute with the resources to work on state-of-the-art research projects. “The partnership with TRIBE will increase the footprint of the Institute, as text data analytics is one of the main focuses of the institute and training personnel is one of its main priorities.”

“As only 15 projects were selected from the initial 120 applications, the CREATE grant is also a recognition of excellence towards the team we’ve put together, both at Dal and at our partner universities,” says Dr. Matwin.