As technological capabilities advance at an ever-increasing pace, the design of effective decision-making systems is becoming both more critical and more complex. Automated decision systems are employed in areas from consumer marketing to public policy to high-stakes fields like criminal justice and healthcare. Getting the architecture and algorithms right is crucial for producing reliable, ethical, and beneficial outcomes.
This article explores key considerations for engineering decision systems fit for our technological era. We examine how to properly model and transform information into a format amenable to automated reasoning. We look at cloud computing services that can provide the secure, scalable, cost-effective infrastructure to run decision algorithms at scale. And we assess approaches for integrating the opinions and perspectives of citizens and consumers to keep decision processes aligned with human values and preferences.
Modeling Information for Automated Decision-Making
At the core of any decision system is the data that serves as the input – the information that the system reasons over and transforms into an output decision or recommendation. How this input information is structured, represented, and fed into the decision engine is a key architectural consideration.
The old data maxim of “garbage in, garbage out” is especially relevant for automated decision systems. If the underlying data is incomplete, skewed, obsolete, or noisy, even the most sophisticated decision models will produce unreliable results. Data quality is paramount.
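As a rough illustration of what a data-quality gate might look like, the sketch below counts records that are incomplete or obsolete before they reach a decision model. The field names, the `max_age_days` threshold, and the `quality_report` function are assumptions made for this example, not part of any particular system.

```python
from datetime import date

# Hypothetical required fields for an input record; adjust to the real schema.
REQUIRED_FIELDS = ("customer_id", "amount", "updated")


def quality_report(records, today, max_age_days=365):
    """Count records that are incomplete or obsolete."""
    report = {"total": 0, "incomplete": 0, "stale": 0}
    for rec in records:
        report["total"] += 1
        # Incomplete: a required field is absent or None.
        if any(rec.get(f) is None for f in REQUIRED_FIELDS):
            report["incomplete"] += 1
            continue
        # Stale: the last update is older than the allowed age.
        if (today - rec["updated"]).days > max_age_days:
            report["stale"] += 1
    return report


sample = [
    {"customer_id": 1, "amount": 10.0, "updated": date(2024, 1, 10)},
    {"customer_id": 2, "amount": None, "updated": date(2024, 1, 5)},
    {"customer_id": 3, "amount": 7.5, "updated": date(2020, 6, 1)},
]
report = quality_report(sample, today=date(2024, 2, 1))
print(report)
```

A report like this could be used to block a training or scoring run when the share of incomplete or stale records exceeds an agreed limit.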
There are different approaches to data modeling depending on the decision domain, the type of data available, and the ultimate output needed. For certain types of structured decisions like scheduling or logistics, the inputs may be highly formatted and quantitative – numerical data, predictable entities and relationships, deterministic rules. In these cases, classical operations research modeling techniques like linear programming, integer programming, and constraint satisfaction problems can produce optimized solutions.
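To make the structured case concrete, here is a minimal sketch of a tiny assignment problem, a classic integer program, solved by exhaustive search. The cost matrix is made up, and brute force is used only to keep the example self-contained; a production system would more likely hand the same model to an LP/IP or constraint solver.

```python
from itertools import permutations

# Hypothetical cost of assigning worker i (row) to task j (column).
COST = [
    [4, 2, 8],
    [4, 3, 7],
    [3, 1, 6],
]


def best_assignment(cost):
    """Return (total_cost, assignment) where assignment[i] is worker i's task."""
    n = len(cost)
    best = None
    # Each permutation is a feasible solution: every task is used exactly once.
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best is None or total < best[0]:
            best = (total, perm)
    return best


total, assignment = best_assignment(COST)
print(total, assignment)
```

Exhaustive search grows factorially with the number of workers, which is exactly why dedicated optimization techniques matter once problems reach realistic size.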
For more open-ended decisions requiring interpretation of unstructured, heterogeneous, qualitative data, other modeling approaches are required. This may involve transforming raw text, audio, images, and video into structured data representations that can be processed by machine learning algorithms. Techniques like natural language processing, computer vision, and dimensionality reduction are employed to extract features and represent real-world information in machine-readable vectors, embeddings, or graphs.
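One of the simplest ways to turn text into a machine-readable vector is a bag-of-words count, sketched below together with cosine similarity for comparing two documents. This is an illustrative stand-in for the richer embeddings mentioned above; the function names and sample sentences are assumptions for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Map text to a sparse term-count vector (a Counter)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


v1 = bag_of_words("The loan was approved quickly")
v2 = bag_of_words("The loan was denied")
v3 = bag_of_words("Sunny weather today")
print(cosine_similarity(v1, v2), cosine_similarity(v1, v3))
```

Documents that share vocabulary score closer to 1, while documents with no words in common score 0. Learned embeddings follow the same vector-comparison idea but capture meaning beyond exact word overlap.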
Human-centered design advocates argue that data modeling for decision systems impacting people’s lives should incorporate human inputs and oversight from the start. Iterative processes like design thinking and user experience research can help identify the key informational requirements and constraints to model. Contextualized data such as demographics, environmental factors, and personal preferences and values need to be accounted for in addition to the core decision variables. Algorithmic impact assessments and third-party auditing can promote responsible data practices.
Storing and Transforming Information with Cloud Infrastructure
Once data has been properly modeled and cleaned, the next consideration is how and where it will be stored, accessed, and computationally transformed into a final decision output. The rise of cloud computing platforms has revolutionized the infrastructure available for developing and deploying decision systems at scale.
Public cloud providers like Amazon Web Services, Google Cloud, and Microsoft Azure offer a dizzying array of services and capabilities for ingesting, storing, processing, and serving information. Scalable cloud databases, data warehouses, and data lakes enable housing huge information repositories in differing formats – from traditional structured SQL tables to unstructured data objects.
Cloud platforms provide robust data transformation capabilities as well. Services for ETL (extract, transform, load), data preparation, feature engineering, stream processing, and batch analysis help convert raw data into representations amenable to model training and inference. Cloud marketplaces offer pre-trained models that can be fine-tuned on proprietary datasets.
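The ETL pattern itself is independent of any provider. The sketch below shows the three stages on a made-up CSV extract, using an in-memory SQLite database as a stand-in for a cloud warehouse. The column names and cleaning rules are assumptions for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; in practice this would come from a file or API.
RAW_CSV = """applicant_id,income,region
1, 52000 ,north
2,,south
3,61000,NORTH
"""


def extract(text):
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Trim whitespace, normalize case, and drop rows with missing income."""
    clean = []
    for row in rows:
        income = row["income"].strip()
        if not income:
            continue
        region = row["region"].strip().lower()
        clean.append((int(row["applicant_id"]), float(income), region))
    return clean


def load(rows, conn):
    conn.execute("CREATE TABLE applicants (id INTEGER, income REAL, region TEXT)")
    conn.executemany("INSERT INTO applicants VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT id, income, region FROM applicants ORDER BY id"
).fetchall()
print(loaded)
```

Managed ETL services typically add scheduling, scaling, and monitoring around the same extract, transform, and load steps.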
And of course, once data is cleaned and prepped and features are engineered, clouds offer plentiful compute resources for spinning up clusters and containers to train and host decision models – whether traditional machine learning workflows or cutting-edge deep learning and other AI model architectures. The ability to dynamically provision and de-provision resources provides cost efficiencies versus operating static on-premises infrastructure.
While cloud-based data storage and compute provide compelling capabilities, they also introduce considerations around data sovereignty, security, privacy, and governance. Cloud providers have developed rigorous security models, but ensuring encryption and access controls are properly configured is critical when dealing with sensitive data inputs for decision systems.
There are also regulatory and policy constraints in certain jurisdictions and domains that require data residency – mandating that information be stored and processed within certain geographic boundaries. These requirements can limit cloud adoption or necessitate hybrid environments blending cloud and on-premises infrastructure.
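A residency requirement can be expressed as a simple policy check that runs before data is deployed to a region. The sketch below is one possible shape for such a check. The dataset classes, region names, and `residency_violations` helper are hypothetical and do not correspond to any provider's API.

```python
# Hypothetical policy: dataset classification -> regions where it may reside.
ALLOWED_REGIONS = {
    "health_records": {"eu-west", "eu-central"},
    "marketing_events": {"eu-west", "us-east", "ap-south"},
}


def residency_violations(deployments):
    """Return (dataset, region) pairs that break the residency policy."""
    violations = []
    for dataset, region in deployments:
        allowed = ALLOWED_REGIONS.get(dataset)
        # Unknown datasets are treated as violations so they get reviewed.
        if allowed is None or region not in allowed:
            violations.append((dataset, region))
    return violations


planned = [
    ("health_records", "eu-west"),
    ("health_records", "us-east"),
    ("marketing_events", "ap-south"),
]
violations = residency_violations(planned)
print(violations)
```

Treating unclassified datasets as violations is a deliberately conservative design choice: it forces a review rather than silently allowing data to leave its permitted boundary.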
From an architecture perspective, cloud services enable harnessing a range of advanced algorithmic capabilities offered through AI/ML platforms and data analytics tools. But there are also important criteria around algorithm selection and governance.
Algorithm Selection and Governance
At the heart of a decision system is the particular algorithm or machine learning model employed to derive insights and recommendations from data inputs. There is a staggering variety of techniques that could potentially be applied drawing from statistics, operations research, machine learning, deep learning, optimization, knowledge representation and reasoning, and other fields.
How is the optimal approach determined? In addition to predictive performance on test data, other key considerations include:
Algorithmic transparency and interpretability: Is the model an opaque “black box” or can its decision-making criteria and logic be inspected and understood? This is critical in regulated domains like finance, healthcare, and public policy.
Fairness and bias: Do the algorithms exhibit discriminatory biases against certain groups or demographics? There are emerging techniques in AI fairness to audit for unwanted biases.
Reliability and robustness: How well does the model perform on out-of-distribution data and edge cases? Is it brittle or easily fooled? Adversarial testing can help validate reliability.
Updatability and retrainability: Can models be efficiently updated as new data arrives to prevent performance degradation and instability? Techniques like continuous learning and transfer learning can help.
Resource consumption: What is the computational cost of training and inference? Some models may be too complex to deploy at scale with reasonable cost and latency.
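Of the criteria above, fairness is one that can be audited with a very small amount of code. The sketch below computes the demographic parity gap, one common fairness measure among several, as the difference in approval rates between groups. The group labels and decisions are made up for illustration.

```python
from collections import defaultdict


def positive_rates(decisions):
    """decisions: iterable of (group, approved) pairs; returns rate per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for group, approved in decisions:
        counts[group][0] += int(approved)
        counts[group][1] += 1
    return {g: a / n for g, (a, n) in counts.items()}


def demographic_parity_gap(decisions):
    """Largest difference in approval rate between any two groups."""
    rates = positive_rates(decisions)
    return max(rates.values()) - min(rates.values())


# Made-up decisions for two hypothetical groups, A and B.
sample = [("A", True), ("A", True), ("A", False), ("A", True),
          ("B", True), ("B", False), ("B", False), ("B", False)]
rates = positive_rates(sample)
gap = demographic_parity_gap(sample)
print(rates, gap)
```

A gap near zero means groups are approved at similar rates. A large gap is a signal to investigate, not proof of discrimination on its own, and other fairness definitions may be more appropriate depending on the domain.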
In many decision contexts, several models work together under an ensemble, federation, or mixture-of-experts approach. Different component algorithms are employed for different subproblems, regions, or data distributions before final adjudication. This adds architectural complexity but can optimize for different objectives.
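The simplest form of final adjudication in an ensemble is a majority vote. The sketch below assumes three hypothetical component models, each reduced to a one-line rule so the example stays self-contained; real components would be trained models.

```python
from collections import Counter


def majority_vote(models, x):
    """Ask each component model for a label and return the most common one."""
    votes = Counter(model(x) for model in models)
    label, _ = votes.most_common(1)[0]
    return label


# Three hypothetical component models, each a simple rule on an applicant dict.
def income_rule(a):
    return "approve" if a["income"] >= 40000 else "deny"


def debt_rule(a):
    return "approve" if a["debt_ratio"] <= 0.4 else "deny"


def history_rule(a):
    return "approve" if a["missed_payments"] == 0 else "deny"


models = [income_rule, debt_rule, history_rule]
applicant = {"income": 55000, "debt_ratio": 0.5, "missed_payments": 0}
decision = majority_vote(models, applicant)
print(decision)
```

Using an odd number of binary voters avoids ties. Weighted voting or a learned gating function are common refinements when components differ in reliability.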
Other key governance concerns include algorithm provenance (where did a model originate and what data was it trained on?), monitoring for concept drift and data distribution shifts, and aligning algorithms with organizational ethics policies.
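Monitoring for distribution shift can start with something very simple. The sketch below flags drift when the mean of a recent window moves more than a chosen number of reference standard deviations. This is a crude illustrative check, not a full statistical test, and the `threshold` value and sample data are assumptions.

```python
import statistics


def mean_shift_score(reference, recent):
    """Shift of the recent mean, in units of the reference standard deviation."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    return abs(statistics.mean(recent) - ref_mean) / ref_std


def drifted(reference, recent, threshold=2.0):
    """True when the recent window has shifted beyond the threshold."""
    return mean_shift_score(reference, recent) > threshold


reference = [50, 52, 48, 51, 49, 50, 53, 47]
stable = [49, 51, 50, 52]
shifted = [60, 62, 61, 59]
print(drifted(reference, stable), drifted(reference, shifted))
```

A check like this would typically run per feature on a schedule, with alerts prompting a closer look or a retraining run.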
Responsible decision system design may involve layering different modular algorithms that apply successive filtering, constraints, and adjustments to accommodate distinct policy priorities, risk tolerances, and contextual criteria.
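The layering idea can be sketched as a base model followed by an ordered list of policy functions, each of which may override the decision it receives. The scoring formula, thresholds, and rule names below are hypothetical and chosen only to show the structure.

```python
def model_score(applicant):
    # Stand-in for a trained model: a made-up linear score in [0, 1].
    return min(1.0, applicant["income"] / 100000)


def risk_cap(applicant, decision):
    # Risk layer: large requests always go to manual review.
    if applicant["amount"] > 50000:
        return "review"
    return decision


def eligibility_rule(applicant, decision):
    # Policy layer: applicants under 18 cannot be approved.
    if applicant["age"] < 18:
        return "deny"
    return decision


# Layers run in order; later layers can override earlier ones.
LAYERS = [risk_cap, eligibility_rule]


def decide(applicant, threshold=0.5):
    decision = "approve" if model_score(applicant) >= threshold else "deny"
    for layer in LAYERS:
        decision = layer(applicant, decision)
    return decision


print(decide({"income": 80000, "amount": 10000, "age": 30}))
```

Keeping each policy in its own function makes the order of precedence explicit and lets individual rules be reviewed, tested, or replaced without retraining the underlying model.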
Integrating Human Inputs and Domain Knowledge
While automated decision-making systems provide compelling benefits in efficiency, scale, and analytical power, an often overlooked or underemphasized component is how to properly incorporate human inputs, oversight, and domain-specific knowledge.
There are a variety of techniques for integrating human feedback loops into decision system pipelines:
Machine teaching: Instead of training solely on data, having subject matter experts provide rules, guidance, and oversight on the learning process.
Interactive learning: Human-in-the-loop systems where domain experts can validate, modify, or reject model inferences and outputs to iteratively refine decision criteria.
Contextual policies: Defining higher-level constraints, overrides, business rules, and policy functions to align system decisions with organizational principles and priorities.
Human mimicry: Training initial models to replicate or emulate the decision process of leading subject matter experts before generalizing to wider deployment.
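A common way to implement the interactive, human-in-the-loop pattern from the list above is confidence-based routing: confident predictions are accepted automatically and the rest are queued for an expert. The sketch below assumes a hypothetical confidence threshold of 0.8 and simple dictionary records.

```python
def route(prediction, confidence, threshold=0.8):
    """Auto-accept confident predictions; send the rest to a human reviewer."""
    if confidence >= threshold:
        return {"decision": prediction, "source": "model"}
    return {"decision": None, "source": "human_review"}


def apply_review(routed, reviewer_decision):
    """Record the expert's decision for a case that was sent to review."""
    if routed["source"] != "human_review":
        return routed
    return {"decision": reviewer_decision, "source": "human"}


auto = route("approve", 0.93)
queued = route("deny", 0.55)
resolved = apply_review(queued, "approve")
print(auto, queued, resolved)
```

Cases resolved by reviewers can also be logged as labeled examples, so that expert corrections feed back into later retraining.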
Another important consideration is how to incorporate the voices, opinions, and perspectives of citizens, consumers, and end-users impacted by the decision system. This can provide crucial signals about public preferences, values, and sensibilities.
This public feedback can be collected through a variety of channels:
Social media commentary: Aggregating and analyzing discussion and sentiment from platforms like Twitter, Facebook, Reddit, and other online communities.
Product reviews and e-commerce ratings: Distilling insights from customer reviews and ratings on shopping websites and platforms.
Survey data: Designing targeted surveys to solicit input on specific issues and proposals related to the decision context.
Public forums and consultation: Providing mechanisms for open-ended input and dialogue through online forums, town halls, call centers, and similar channels.
Conclusion
As technological capabilities advance rapidly, the ability to design sophisticated yet responsible automated decision systems becomes both more critical and more complex. Several factors must be carefully considered:
How information is properly modeled, cleaned, and represented in formats amenable for machine learning and optimization algorithms. Ensuring data quality, validity, and context through human-centered processes.
Leveraging cloud computing services and AI platforms to provide the secure, scalable infrastructure for ingesting, transforming, and reasoning over large data repositories in a cost-effective manner.
Selecting and governing decision algorithms that balance predictive power with other crucial criteria like transparency, fairness, reliability, updatability, and performance.
Integrating domain knowledge and human feedback loops through machine teaching, interactive learning, policy rules, and other techniques to align system outputs with real-world constraints and human preferences.
And meaningfully incorporating the voices, opinions, and perspectives of citizens, consumers, and public stakeholders whose lives are impacted by the decisions.
Architecting automated decision systems is a multi-faceted challenge spanning data management, cloud computing, machine learning, human-computer interaction, policy, ethics, and more. But getting the socio-technical process right is critical as these systems increasingly influence decisions shaping our lives and societies.