Topic 2: Making sense of big data
Growing Bigger and More Accurate with GSBPM (part three)
Arbi Setiyawan, Ni Lessari, Hanik Devia
Unlike countries with a single language, Indonesia has more than three hundred local languages with more than seven hundred dialects. Currently, all descriptions of census and survey variables in questionnaires, data entry programs, metadata, and interviewer guide books are available only in the single national language. Interviewers may not be able to translate variable descriptions accurately from the national language into a local language and further into a particular dialect. This leads to misinterpretation and low accuracy in the collected variables. We propose a consolidation among local languages, carried out at the National Statistics College in the context of statistical education, to produce official statistics variables. The consolidation will produce multilingual versions of the official statistics materials mentioned above. Every year several hundred new students enter the National Statistics College, representing almost every leading local language. They are an untapped resource ready for this purpose. Data accuracy can be improved with multilingual variable descriptions. This will generate as many descriptions of a variable as there are local languages, but it will make the data more accurate. Variables will no longer be biased by translation, because they have been explained in the local language. The Generic Statistical Business Process Model (GSBPM) provides a structured approach to arriving at more accurate data. The personal computer owned by every student offers far more ease and flexibility for reviewing, validating, and editing GSBPM sub-processes during education. The campus has long had standardized software to support this purpose.
Topic 3: Evidence-based decision-making under risk and uncertainty
Economic Contiguity (part one) Physical Construction Cost
Dwi Jayanti, Diah Daniaty, Resti Fitri, Khoirotun Nisa
Law Number 33 of 2004 on financial balance is further elaborated by a Presidential decision stating that the regional allocation fund is based on a fixed amount. If particular regions receive more than they deserve, other regions receive less than they expect. The allocated fund is weighted by, among other things, the index of physical construction cost (IKK). The authors propose a way to bring the IKK closer to reality. The selected components of physical construction in the IKK are applied to all regions. It turns out that regions considered rich may have a different physical cost structure from less wealthy regions. Furthermore, for a few important components, cost may need to be compared in light of regional wealth. We found at least one natural component in the western region and one manufactured component in the eastern region for possible revision. Currently, regional computerized raw data entry is standardized by a single national custom-made application. A cut-off point for entry acceptance is applied to all regions, which sometimes violates region-specific characteristics. Once raw data are rejected, it is difficult to trace back the real raw data, which may be closer to the truth. We therefore rely on spatial contiguity to recover rejected raw data that were never entered. The theory of composite indices suggests that a figure is produced not only for publication but also for analytical soundness. This introduces another unfamiliar step for most regional employees and officials. Since raw data are always available at the head office, it is feasible to evaluate regional raw data there prior to aggregation, beyond the insight of most regions.
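The abstract does not say how spatial contiguity is used to recover rejected entries. As one possible reading, the sketch below fills a missing regional value with the mean of its contiguous neighbours. The function name `fill_from_neighbours`, the region labels, and the cost figures are hypothetical and serve only as illustration; they are not the authors' method or data.

```python
from statistics import mean
from typing import Dict, List, Optional


def fill_from_neighbours(
    values: Dict[str, Optional[float]],
    neighbours: Dict[str, List[str]],
) -> Dict[str, Optional[float]]:
    """Fill missing regional values with the mean of contiguous neighbours.

    Assumption: a rejected entry is stored as None, and only originally
    observed neighbour values are used (filled values are not propagated).
    """
    filled = dict(values)
    for region, value in values.items():
        if value is None:
            known = [
                values[n]
                for n in neighbours.get(region, [])
                if values.get(n) is not None
            ]
            if known:
                filled[region] = mean(known)
    return filled


# Hypothetical regions and component costs, for illustration only.
costs = {"A": 100.0, "B": None, "C": 120.0, "D": None}
borders = {"B": ["A", "C"], "D": ["B"]}
recovered = fill_from_neighbours(costs, borders)
```

Region D stays missing here because its only neighbour, B, has no observed value; a real procedure would need an explicit rule for such cases.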
Fast and Frugal Trees in the Wild
Laura Martignon
Fast-and-frugal trees that have been used successfully in different fields of application are described. Fast-and-frugal trees can be used as decision-making tools that operate as lexicographic classifiers and, if required, associate an action (decision) with each class or category. They were introduced and conceptualized in 2003 (Martignon et al., 2003; Martignon, Katsikopoulos and Woike, 2008) and are now used systematically in a variety of applied fields. Here a list of such trees used in specific domains is presented and illustrated with corresponding diagrams.
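To make the idea of a lexicographic classifier concrete, here is a minimal sketch of a fast-and-frugal tree: cues are checked in a fixed order, each cue offers one exit, and the final cue has two. The cue names (`cue_a`, `cue_b`, `cue_c`), thresholds, and class labels are hypothetical placeholders, not taken from any of the trees in the talk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    cue: str                       # name of the cue inspected at this level
    test: Callable[[float], bool]  # binary question asked about the cue
    exit_on: bool                  # test outcome that triggers the exit
    label: str                     # class assigned when the exit is taken


def classify(tree: List[Node], default: str, case: Dict[str, float]) -> str:
    """Check cues in order and stop at the first exit (lexicographic rule)."""
    for node in tree:
        if node.test(case[node.cue]) == node.exit_on:
            return node.label
    # Second exit of the last cue: no earlier exit was taken.
    return default


# Hypothetical three-cue tree, for illustration only.
tree = [
    Node("cue_a", lambda v: v > 0, True, "high"),
    Node("cue_b", lambda v: v > 0, False, "low"),
    Node("cue_c", lambda v: v > 0, True, "high"),
]

first_exit = classify(tree, "low", {"cue_a": 1})
second_exit = classify(tree, "low", {"cue_a": 0, "cue_b": 0})
last_exit = classify(tree, "low", {"cue_a": 0, "cue_b": 1, "cue_c": 0})
```

The first call supplies only `cue_a`: because the tree exits there, the later cues are never looked up, which is what makes the procedure frugal.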
Topic 4: Statistics Education in the 21st century
Developing data competence in primary school
Daniel Frischemeier
In the present Data Science era, competent data handling is inevitable and important for becoming a responsible citizen. Therefore the development of data competence should begin as early as possible in the curriculum. Our main idea is to introduce primary school students to real statistical projects and to let them experience the phases of the PPDAC cycle (problem, plan, data, analysis and conclusions) on their own. Comparing groups in particular, an important activity in statistics, involves applying many fundamental statistical ideas such as distribution, representation and variability. Even at primary school level, students can be engaged in such activities by offering pre-stages of formal comparison concepts like center and spread.