🤖

Nimbrights

A Robot Ranch

What is an LLM?

Large Language Models are revolutionizing how we interact with technology, but what exactly are they? Discover the fundamentals of machine learning, deep learning, and transformer architecture that power tools like ChatGPT, Claude, and Gemini.

To understand LLMs, we need to start with the fundamentals: Machine Learning, Deep Learning, and a remarkable Deep Learning architecture called a transformer.

Understanding Machine Learning

Machine learning is a specialized branch of AI (Artificial Intelligence) that focuses on getting systems to learn from data. The systems are not given explicit rules for the task; instead, they are shown lots and lots of data (for an LLM, billions or trillions of words from books, articles, and other text sources) and learn the patterns in it.

Diagram showing the relationship between AI, Machine Learning, and Deep Learning

Understanding Deep Learning and Neural Networks

Deep learning is a subfield of machine learning that uses neural networks to identify patterns in the data. A neural network is a set of interconnected nodes organized in layers: an input layer, many middle (hidden) layers, and an output layer. Each node is a mathematical function, and the output from the nodes in one layer becomes the input to the nodes in the next layer. The entire neural network is called a model, and an LLM is built on a special type of neural network architecture called a transformer.

An LLM is a deep learning model trained on a massive dataset to understand and generate text. Some models are also trained on images, audio, or video. For example, GPT-4o (a model behind ChatGPT) was trained on text, images, audio, and video, making it a multimodal model (meaning it can process multiple types of data). Basically, an LLM is a statistical machine that repeatedly predicts the next word. Technically, LLMs predict the next token, which might be a word or part of a word. I will use "token" from now on. Image and music generation models use similar processes but different architectures. This article focuses on text-trained models.

Why has AI, specifically LLMs, exploded into the public sphere?

Up to 2017, most deep learning language models processed words one at a time, in sequence. In 2017, a team at Google published a paper introducing the transformer. A transformer is a neural network architecture that can process all the tokens in a passage in parallel (something GPUs excel at). The nitty-gritty of transformers is that each token is associated with a list of numbers called a vector (stacked together, the vectors for a whole passage form a matrix). The numbers encode the meaning of the token.

Transformers have a technique called attention. It allows the vectors to communicate with one another and refine the meaning of each token based on its surrounding context. So the vector for the word "jump" will change to fit the context:

"We jump rope" versus "I need to jump the battery."

The vector for "jump" will end up different in each context.
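
To make the idea concrete, here is a minimal sketch of attention in plain Python. It is an illustration under loud assumptions, not how any production model is written: the two-number lists standing in for each word are invented for this example, and the learned query, key, and value projections that a real transformer applies are skipped entirely. It only shows that the numbers for "jump" come out differently depending on the neighboring words.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(vectors):
    """Blend every token's numbers with those of its neighbors.

    Simplified scaled dot-product attention: each token scores every token
    in the passage (including itself), and its new numbers are the weighted
    average of all the tokens' numbers.
    """
    dim = len(vectors[0])
    updated = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        updated.append(mixed)
    return updated

# Hypothetical toy numbers: "jump" starts out identical in both passages.
rope_passage = [[1.0, 0.0], [0.5, 0.5], [1.0, 0.2]]     # we, jump, rope
battery_passage = [[0.0, 1.0], [0.5, 0.5], [0.1, 1.0]]  # I, jump, battery

jump_in_rope = attention(rope_passage)[1]
jump_in_battery = attention(battery_passage)[1]
print(jump_in_rope)
print(jump_in_battery)
```

The two printed lists differ even though "jump" started with the same numbers in both passages, because its neighbors pulled it in different directions.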

Transformers have another operation, called a feedforward neural network, which gives the LLM extra capacity to store patterns it learned during training.

The LLM alternates between these two operations, attention and the feedforward neural network, over and over again, so the numbers attached to each token become "enriched." Enriched means they contain richer contextual information, capturing the subtle relationships between words. Here is a user prompt:

"It was a dark and stormy ___________."

The LLM will predict the next token. Each possibility has a percentage indicating how likely that token will be correct for the sequence.

Options:

  • night 78%
  • evening 12%
  • afternoon 5%
  • day 3%

So the most likely token for the prompt will be "night": "It was a dark and stormy night." But sometimes the LLM will choose another token.
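
The choice among those options can be sketched as a weighted random draw. This is a minimal illustration, assuming the four percentages listed above are the whole picture (they add up to 98%, so a real model would spread the remainder over other tokens). The function names and the seed are made up for this example.

```python
import random
from collections import Counter

# Percentages from the example above, written as fractions.
NEXT_TOKEN_PROBS = {"night": 0.78, "evening": 0.12, "afternoon": 0.05, "day": 0.03}

def greedy_next_token(probs):
    """Always pick the single most likely token."""
    return max(probs, key=probs.get)

def sample_next_token(probs, rng):
    """Pick a token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so the run is repeatable
samples = Counter(sample_next_token(NEXT_TOKEN_PROBS, rng) for _ in range(1000))

print(greedy_next_token(NEXT_TOKEN_PROBS))
print(samples.most_common())
```

Over many draws "night" wins most of the time, but the other tokens still show up now and then, which is why the same prompt can end differently.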

Why do LLMs produce different content for the same prompts?

The LLM does not always pick the single most likely token. Instead, it samples from the probability distribution, so a less likely token is chosen on occasion; a setting usually called temperature controls how much randomness is allowed. This is how we get different outputs despite asking the same question, and the variety makes the text read as more natural and less repetitive. Why is this interesting? The LLM was trained on specific data, yet stacking the transformer's two operations (attention and a feedforward neural network) produces "emergent behavior": abilities nobody explicitly programmed. This creates a challenge called the "black box problem": we can see the inputs and outputs, but can't fully explain the reasoning process in between.

Conclusion

In a nutshell, an LLM is a sophisticated mathematical function that ranks tokens and typically selects the ones with the highest probability. An LLM is a powerful tool, but you will get better use out of an LLM if you treat it as a teammate rather than a tool. We use a hammer to hammer nails or remove nails. It is a one-and-done operation, as befits a tool. A one-and-done mindset with an LLM is not as helpful. When you use an LLM, iteratively provide feedback to refine the project you're working on. I guarantee you will have a more productive session with the LLM.

How to Talk to AI: Writing Better Prompts for Better Results

The emergence of LLMs has brought in what feels like a tidal wave of change. There are a bunch of LLMs you can run through your browser or run on your local machine. So we are encountering new terms, new subscriptions, and learning the special way to speak to the LLM. But what is prompt engineering?

Prompt engineering focuses on crafting effective prompts that unlock the LLM's capabilities, enabling it to understand your intent, follow your instructions, and, hopefully, produce working or semi-working output. Prompt engineering is a new skill (part art, part science) worth having because it plays a role in how you interact with the technology in a safe way.

The Prompt

The prompt is the input you provide: it can be a statement or question, a set of statements/questions, a set of instructions, or code snippets. Your prompt is the first hurdle to overcome when speaking with the AI because it influences the relevance and quality of the AI's output.

You will provide context by explaining your situation or providing examples so the AI understands your desires and generates the most accurate and relevant outputs. It is essential to give feedback to the AI as if it were a teammate, as this provides more context. The AI will know that it is meeting your needs and what it needs to do next.

Do not be vague when writing your prompts. Concrete ideas, instructions, and direct language work best to provide the best results. Do not make the AI guess your intentions, unless you want to play the Twenty Questions game.

Different LLMs or models may respond better to specific formats, such as simple commands, instructions, or natural-language questions and statements. Spend a few minutes investigating the model you're using to determine which format works best.

Prompting is a Conversation, Not a Command

Your first prompt is your starting point. That is not where things should end. If your prompt was complex, you will likely be unhappy with the first set of results you are given. Iteration is key to getting the results you want. Interact with the AI and provide feedback so it understands that specific goals have been achieved and that the focus has become refined to a subtopic or new topic. An example: "Remove the section on collaboration but expand on the technical details." If you want a specific format of the output, tell the AI "Put the article in Markdown. Put the data in a Markdown table with 4 columns." If you need an expert, tell the AI who to be - you are assigning the AI a persona or role to play. This can be quite helpful depending on your goals.

Types of Prompts

Direct prompts - you provide the model with a direct instruction or question, with no other details.

Multi-shot prompts - you provide an example or multiple examples of input-output pairs, then present the prompt. You are giving better context, so the AI can better understand the task at hand.

Chain-of-Thought prompts - you walk the AI through a chain of reasoning, which prompts it to break down complex ideas into a set of steps it can take to get the job done.

Direct Chain-of-Thought prompts - you directly ask the AI to perform reasoning steps (asking it to investigate a task or topic to build a set of instructions).
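
As an illustration of the multi-shot type, here is a minimal sketch that assembles input-output examples into a single prompt string. The function name, the "Input:"/"Output:" labels, and the sample reviews are all assumptions made up for this example; no particular model requires this layout.

```python
def build_multi_shot_prompt(instruction, examples, new_input):
    """Assemble an instruction, worked examples, and the new input into one prompt."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    # End with the new input and an empty "Output:" for the model to complete.
    lines.append(f"Input: {new_input}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_multi_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [
        ("The battery lasts all day.", "positive"),
        ("The screen cracked in a week.", "negative"),
    ],
    "Setup took five minutes and everything worked.",
)
print(prompt)
```

The resulting text can be pasted into any chat box. The examples show the AI the pattern you expect before it sees the real question.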

Putting it all Together: Examples of Prompt Engineering

  1. Creative writing

    Specify the genre, tone, style, plot points, characters, and situations to guide the AI in creating the story.

    Prompt: Write a short story of 5,000 words about a gnome who discovers a way to travel between trees. It's a fantasy with a friendly and optimistic tone. Use Terry Pratchett's style, minus the snarkiness. The gnome's name is Gwendolyn "Gwen" Thornberry; she's 25 years old and works as a scribe at the Library of Esterhelm. The tree she encounters is in a forest just outside the city of Esterhelm and is tended to by druids. Involve druids in the story. I don't know how she finds the tree or discovers its secret, so that's for you to figure out.

  2. Dialogue

    Prompt: You are a friendly professor at an Ivy League university who excels at teaching and mentoring students in journalism. Please interact with me by asking questions and analyzing my answers. I would like to discuss journalism and its impact on society.

  3. Summarization

    Prompt: [First upload your text, or paste it into the prompt]. Look at the text I have provided and summarize the journal article on tidal energy and its production methods.

  4. Translation

    Prompt: Translate the following text from English into Spanish, French, and Arabic: "To live is the rarest thing in the world. Most people exist, that is all."

  5. Code generation

    Prompt: Create an HTML, CSS, and JavaScript application; do not use any frameworks. There are three centered buttons on the page: Comic Books, Graphic Novels, and Manga. When I select any of the buttons, a separate window appears that allows me to input the following data: Title, Publisher, Date of publication, ISBN, Price, Number of pages, Main character(s), Supporting character(s), Beginning, Ending, and Rating which shows five uncolored or transparent stars. Allow the user to select a specific number of stars. So if I choose 4 stars, 4 stars appear yellow. If I select two stars, the rating updates to the new number of stars that appear yellow. Name the application My Library of Books.

  6. Debugging

    Prompt: [Upload or paste code into chat box.] Look at the code I have entered. I am getting this error: "Uncaught TypeError: Cannot read property 'gen' of undefined at :1:45"

  7. Data analysis

    Prompt: [Upload data or paste data into chat box.] Analyze the data from the customer order database and identify trends in purchasing behavior. Present the analysis in text and pie charts.

  8. Report generation

    Prompt: [Upload data or paste data into chat box.] Create a weekly status report of what I have achieved over the past week. The tone is direct and formal. The purpose is to show management I am meeting the company's metrics on the number of stories completed. Include the following sections: Summary (a brief overview of the week), Milestones (what I have achieved), Roadblocks (the issues I've encountered while doing my work), and Next Steps (a list of specific priorities for the upcoming week).

  9. Personalized learning

    Prompt: I want to learn about gears. I'm interested in mechanical engineering and am curious about using gears to produce motion. Provide a broad overview of gears, how they work, and where they fit in mechanical engineering. [Then ask follow-up questions as you learn things.]

  10. Language practice

    Prompt: I am a third-year student learning Ukrainian. You are a native Ukrainian, and I will ask you questions about finding a book in a library and about researching folk tales. We will speak Ukrainian.

Quick Tips for better prompts

  1. Assign the AI a role: senior editor, high school math teacher, or a close, personal mentor
  2. Identify the desired length of the output; numbers are better than "long" or "short"
  3. Specify the audience
  4. Specify the format you want: text file, HTML file, markdown, table format
  5. Specify what you don't want: jargon, buzzwords, slang
  6. Use action verbs to identify the desired action
  7. Provide facts and/or data
  8. Define key terms/concepts
  9. Reference or upload specific sources
  10. Identify the amount of detail you want
  11. Break down complex tasks into individual steps
  12. Try different keywords or phrases
  13. Encourage the AI to perform step-by-step reasoning
  14. Ask the AI to explain its reasoning
  15. Guide the AI through a list of instructions or a sequence of thoughts

Conclusion

AIs are useful. I like to think of them as teammates working together on a project. This means collaboration, feedback, and adaptability are in constant use as the AI and I iterate and build the solution, whether that is interview prep, practicing a difficult conversation I need to have with a family member, or getting feedback on an article I've written. Take some time to practice your prompts on topics you already know well, which will help you understand the strengths and limitations of AI.

Understanding the CIA Triad: The Foundation of Information Security

The MGM ransomware attack in September of 2023 violated all three principles of the CIA triad: Confidentiality, Integrity, and Availability. Personally identifiable information (PII) was compromised, violating confidentiality; information was modified, violating integrity; and several systems were disabled, violating availability. What is the CIA triad, and why is it so crucial to cybersecurity?

The CIA Triad is the foundation that guides all security decisions. Critical information must remain confidential, be accurate, and be available to users. This framework provides a lens through which organizations can identify and address security gaps across each concern. It is a framework to protect data. Meeting all three principles strengthens an organization's security posture, improving its ability to respond to threats and incidents.

Confidentiality

Data can be classified in many ways. The US military and the US government use Confidential, Secret, and Top Secret designations to identify their data, while a civilian organization may use Public, Private, Confidential, and Restricted data. Confidentiality is about limiting access to authorized individuals and systems. This is done by controlling data: who has access, when they have access, and policies for moving data within and out of the organization.

There are two main ways confidentiality can be compromised: direct attacks and unintentional (human error) violations. Direct attacks target systems the attacker doesn't have the right to access, using man-in-the-middle attacks, cross-site scripting, password cracking, or gaining administrative access. Human error occurs through misunderstanding or through insufficient security policies and controls. Users may share passwords, lose a laptop, or install unvetted software that has been compromised.

Healthcare data and financial data from credit card companies, merchants, and banks are sensitive and must be protected. In some cases, these types of data are subject to the laws and regulations of the countries in which they are used, so it is essential to understand that there is often legal liability involved in protecting sensitive information.

Protecting confidentiality involves a multilayered strategy, including data classification based on sensitivity, strong access controls and policies, end-to-end encryption, data loss prevention (DLP) products to protect information sent outside the company, and various authentication procedures, such as passwords, multi-factor authentication, key cards, or biometric logins.

Integrity

Is your data trustworthy? Or has it been tampered with? Data integrity means it is authentic, accurate, and reliable. Users and customers need to know that the organization's data is correct.

Data integrity can be compromised intentionally or unintentionally. Both kinds of violation can be detected or prevented using hashing, encryption, digital signatures, or digital certificates. On the web, organizations can obtain a certificate from a certificate authority (CA) to verify the authenticity of their website, so visitors know it is a legitimate site. This is often seen as the lock icon in a browser's address bar, indicating the site is using a valid certificate from a trusted CA.

A critical option gaining popularity worldwide is the use of digital signatures. Digital signatures are used in email to prove that a message originated with its stated sender. They also provide non-repudiation, which means the sender cannot later deny having sent the message. Software code can also be digitally signed to verify its authenticity. The trust relies on certificate authorities and their processes for creating and maintaining certificates, while also providing a mechanism to revoke certificates that are compromised or no longer valid.

These mechanisms are often used as part of a defense-in-depth strategy, where multiple layers work together to ensure data integrity.
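
Hashing, the simplest of these mechanisms, can be sketched in a few lines with Python's standard library. This is a minimal illustration, not a complete integrity solution: the message text and function names are invented for this example, and it assumes the published digest itself reaches the reader through a channel the attacker cannot alter (which is the gap digital signatures close).

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_unmodified(data: bytes, expected_digest: str) -> bool:
    """Check the data against a previously recorded digest."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sha256_hex(data), expected_digest)

original = b"Transfer $100 to account 12345"
published_digest = sha256_hex(original)  # recorded while the data is trusted

tampered = b"Transfer $900 to account 12345"

print(is_unmodified(original, published_digest))
print(is_unmodified(tampered, published_digest))
```

Changing a single character produces a completely different digest, so the tampered message fails the check while the original passes.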

Availability

If the data is unavailable when needed, the value is diminished, and business operations can halt. The entire system — network, computers, applications, and services — must be operational so that the data is available for use. Otherwise, the organization suffers reputational harm and possible financial harm.

Availability issues can stem from many factors, including denial-of-service or distributed denial-of-service attacks, hardware failures, misconfigurations, misuse, and natural events such as tornadoes and hurricanes. It is essential for the organization to create disaster recovery plans for the various incidents it may face, and to create a separate business continuity plan so it can restore its services as quickly as possible in a thoughtful, planned way. A key consideration is the impact of downtime on business operations.

Organizations can use a hot site (a full replica of the work environment), a warm site (the primary necessities are there but need setup and configuration), or a cold site (often an office with furniture but no technology). Redundant networks, servers, databases, and applications can be set up so that when the primary system encounters a problem, the troubled system fails over to the working system. A robust data backup policy (including off-site storage, regular testing of backups, and actionable recovery plans with clear communication channels) is fundamental to ensuring availability.

Conclusion

Different industries prioritize the three CIA principles differently depending on their needs: healthcare tends to emphasize confidentiality, finance integrity, and streaming services availability. The CIA triad is a decision-making framework to address risk in organizations. The triad is a valuable tool for threat modeling, vulnerability management, developing software security requirements, and incident response planning. Ultimately, the CIA triad is not just a theoretical concept; it is a practical framework for building and maintaining a resilient security posture in an organization.

Quantitative Risk Assessment: Choosing the Right Approach

It's time for the annual risk assessment at the organization: the CISO will need numbers to justify the security budget to executives. The risk assessment allows the team to determine and rank the risks to the organization, as well as identify the best methods to control the risks. There are two primary methods for conducting a risk assessment: quantitative and qualitative. It is essential to understand the strengths of each so they can be applied in different scenarios.

A risk assessment (or risk analysis) is a process for identifying and evaluating risks. Each risk is rated by the severity of its impact, and the risks are then ranked from most to least important. What is the difference between risk management and risk assessment?

Risk management is a continuous process that the organization undertakes to proactively identify and mitigate threats, adapt to the changing technical landscape, and achieve its objectives. This cycle ensures that new risks are discovered, existing ones are reassessed, and the effectiveness of controls is regularly checked. This is not a one-time activity. Risk assessment, by contrast, is done at a specific point in time and is then considered complete, although it should be repeated at least annually or whenever a control changes. Risk assessments come in two forms, quantitative and qualitative. These methods are not interchangeable; they are applied in different scenarios depending on the availability of historical data and the need for objective financial metrics. The quantitative risk method will be addressed in this article. Qualitative risk assessment will be covered in the next article.

What is Quantitative Risk Assessment?

A quantitative risk method uses numerical values, such as dollar amounts, to express the key terms in the risk assessment. Data is gathered from company records and entered into standard formulas. The results of these calculations help identify the priority of risks. Also, the results can be used to determine the effectiveness of controls.

Some of the key terms are:

Single loss expectancy (SLE) - the total loss expected from a single incident. An incident occurs when a threat exploits a vulnerability.

Exposure factor (EF) - the percentage of an asset's value that would be lost if a threat event occurred.

Asset value (AV) - the value of the asset to the business (e.g., customer database, web server).

Annual rate of occurrence (ARO) - the number of times an incident is expected to occur in a year.

Annual loss expectancy (ALE) - the total loss expected from a risk over one year.

Safeguard value - the cost of a control (e.g., a backup solution).

Term | Formula | Description
SLE | AV * EF | The cost of a single adverse event.
ARO | (Based on historical data) | How many times the event is expected to happen in a year.
ALE | SLE * ARO | The total expected cost from a risk over one year.

An example: The company issues laptops to its employees at a value of $4,000 each. This includes the hardware, software, applications, and data. Historically, the company has lost an average of 3 laptops per year. Fifty laptops are in use.

Asset value: $4,000
Exposure Factor (EF): 100% or 1.0
What is the SLE? Single Loss Expectancy is $4,000 * 1.0 = $4,000

What is the ARO? Annual Rate of Occurrence is 3, based on historical data.

What is the ALE? Annual Loss Expectancy is $4,000 * 3 = $12,000

The company considers buying hardware locks for the laptops at a cost of $20 each. The company estimates that the ARO will be 1 laptop lost per year after the control is implemented.

ARO with the control: 1
ALE with the control: $4,000 * 1 = $4,000

Safeguard value (cost of control): $20 * 50 = $1,000
Savings with the control: $12,000 (Current ALE) - $4,000 (ALE with control) = $8,000
Realized savings: $8,000 (Savings) - $1,000 (Safeguard value) = $7,000

The cost-benefit analysis (CBA) shows a net savings of $7,000 in the first year, indicating the locks should be purchased.
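
The laptop example can be reproduced in a few lines of code, which makes it easy to rerun the numbers when an input changes. This is a minimal sketch using the figures from the example above. The function and variable names are made up for illustration.

```python
def single_loss_expectancy(asset_value, exposure_factor):
    """SLE = AV * EF"""
    return asset_value * exposure_factor

def annual_loss_expectancy(sle, aro):
    """ALE = SLE * ARO"""
    return sle * aro

# Figures from the laptop example.
asset_value = 4_000      # dollars per laptop
exposure_factor = 1.0    # a lost laptop is a 100% loss
laptops_in_use = 50
lock_cost_each = 20      # dollars per hardware lock

sle = single_loss_expectancy(asset_value, exposure_factor)
ale_before = annual_loss_expectancy(sle, aro=3)  # 3 laptops lost per year
ale_after = annual_loss_expectancy(sle, aro=1)   # estimate with locks in place

safeguard_value = lock_cost_each * laptops_in_use
savings = ale_before - ale_after
net_savings = savings - safeguard_value

print(f"SLE: ${sle:,.0f}")
print(f"ALE before control: ${ale_before:,.0f}")
print(f"ALE with control: ${ale_after:,.0f}")
print(f"Net first-year savings: ${net_savings:,.0f}")
```

A positive net savings figure supports buying the control; a negative one means the control costs more than the loss it prevents.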

Weaknesses of Quantitative Risk Assessment

The quantitative method is time- and resource-intensive and is best used in mature programs or programs with regulatory requirements. It also requires that historical data exist, which might not be the case. In some instances, the team may have to project into the future, which can compromise the accuracy and undermine the credibility of the numbers. There's also a chance of something called false precision, where numbers can seem more accurate than they actually are in the real world. Intangibles, such as organizational reputation and customer goodwill, can be challenging to quantify.

Benefits and Strengths of Quantitative Risk Assessment

A benefit of the quantitative method is that, once the input data is gathered, it reduces to a simple math problem, although gathering that data is time-consuming and challenging. If automated tools are used for the assessment, these values will be calculated by the tool. A second benefit is that the method supports a cost-benefit analysis (CBA), which compares the cost of a control with the loss it prevents. The formulas used are verifiable and objective, lending credibility to the derived numbers. It is easy to compare different risks financially. Crucially, by presenting risks in monetary terms, this method provides a common language that resonates with executives and board members, making it easier to justify the return on security investments.

Conclusion

Quantitative risk assessment is a numbers-driven process, and the results are easy to understand. While this method has weaknesses, its strengths really shine when used in organizations with mature processes and historical data sets. The end goal is to support risk management decisions in managing the risks facing the organization. This approach may very well work for your organization; if not, read the following article on qualitative risk assessment and how it differs from the quantitative method.

Qualitative Risk Assessment: Choosing the Right Approach

Since I've already covered quantitative risk assessment and its strengths and weaknesses in the last article, I'll jump right into qualitative risk assessments. This method is perfectly suited for new programs, initial assessments, or organizations with limited resources. Simply put, a qualitative risk assessment (or analysis) does not use dollar values. Instead, it determines the risk level based on a subjective evaluation of a risk's probability and impact, using categories such as Low, Medium, and High.

What is Probability?

Probability is the likelihood that a threat will exploit a vulnerability. The risk is realized only when the threat actually exploits the vulnerability, a key distinction. A scale, such as low, medium, or high, is used. Percentage values can be assigned to these levels, for example: low is 10%, medium is 50%, and high is 100%. You can use relative numbers if you wish. The probability scale in this article will use the values 1, 2, and 3.

What is Impact?

Impact is the negative result if the risk occurs. It identifies the magnitude of the risk. Remember that a realized risk results in a loss, such as downtime. Again, we will use the words low, medium, or high to rate the loss. The rating can be expressed as a relative value, for example: low = 10, medium = 50, and high = 100. The impact scale in this article will use the values 1, 2, and 3.

The risk level is calculated with the formula:

Risk Level = Probability * Impact

The risk assessment team needs to define the probability and impact scales before it begins work. Low, medium, and high may work for one company, while another chooses the scale: slight, slightly moderate, moderate, moderately severe, and severe. Assess your organization's needs to determine what works best for the situation.

Probability Scale

Probability | Description | Value
Low | The risk is unlikely to occur. | 1
Medium | A moderate chance exists. | 2
High | A high probability exists that the risk will occur. | 3

Impact Scale

Impact | Description | Value
Low | If the risk occurs, it will have minimal impact. | 1
Medium | If the risk occurs, it will have a moderate impact. | 2
High | If the risk occurs, it will have a high impact. | 3

Here's the equation again: Risk Level = Probability * Impact. The resulting score ranges from 1 to 9.

Prioritize the Risks

For each risk found, use the formula and the scales to determine your numbers, then rank the risks in descending order. It is essential to address the most critical risks first, since money and time are limited resources the company has to spend. A risk matrix can be used in an x-y coordinate system to plot individual risks and show their impact.
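
Scoring and ranking can be sketched as follows. This is a minimal illustration using the 1-to-3 scales defined above. The four risks and their ratings are hypothetical, invented only to show the sort order.

```python
# The 1-to-3 scale used in this article, shared by probability and impact.
SCALE = {"low": 1, "medium": 2, "high": 3}

def risk_level(probability, impact):
    """Risk Level = Probability * Impact"""
    return SCALE[probability] * SCALE[impact]

# Hypothetical risks: (name, probability rating, impact rating).
risks = [
    ("Lost laptop", "medium", "medium"),
    ("Web server outage", "medium", "high"),
    ("Phishing email opened", "high", "high"),
    ("Office printer failure", "low", "low"),
]

# Score every risk, then sort from highest score to lowest.
ranked = sorted(
    ((name, risk_level(probability, impact)) for name, probability, impact in risks),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{score}  {name}")
```

The risks at the top of the printed list are the ones to address first.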

Example Risk Matrix

After calculating the risk score (Probability x Impact), you can plot it on a matrix like this to visualize priorities. The scores are categorized into High, Medium, and Low risk levels.

Risk Level = Probability x Impact
Probability \ Impact | Low (1) | Medium (2) | High (3)
High (3) | 3 | 6 | 9
Medium (2) | 2 | 4 | 6
Low (1) | 1 | 2 | 3

Evaluate the Effectiveness of Controls

Now that there is a list of high-impact and moderate risks, a mitigation choices survey can be conducted with data, hardware, and software owners, as well as the InfoSec team. Through this process, experts are surveyed to identify relevant risks and determine the necessary controls. This data allows the risk assessment team to calculate the effectiveness of specific controls on mitigating those risks.

For example: consider the risk of a web server failing. Before the control, the Probability = 2, and the Impact = 3, resulting in a risk score of 2 * 3 = 6. The risk assessment team proposes setting up a redundant server for failover. This doesn't change the probability of the first web server failing, but it dramatically lowers the impact. The new Impact = 1. So the new risk score is 2 * 1 = 2 (Low).
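
The before-and-after comparison for the web server can be sketched like this. One loud assumption: the cutoffs that turn a score into Low, Medium, or High are not defined in this article, so the thresholds in `risk_category` are hypothetical and chosen only so that a score of 2 is Low, matching the example.

```python
def risk_score(probability, impact):
    """Risk Level = Probability * Impact, each on the 1-to-3 scale."""
    return probability * impact

def risk_category(score):
    """Map a score to a label. These thresholds are assumed, not standard."""
    if score >= 6:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"

# Web server example: the redundant server lowers impact, not probability.
before = risk_score(probability=2, impact=3)
after = risk_score(probability=2, impact=1)
reduction = before - after

print(f"Before control: {before} ({risk_category(before)})")
print(f"After control: {after} ({risk_category(after)})")
print(f"Score reduced by: {reduction}")
```

Running the same comparison for each proposed control gives the team a simple way to see which controls lower the score the most.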

The Limitations of the Qualitative Risk Assessment

The most significant limitation is the two scales: probability and impact. They are subjective. The differences of opinion in creating the scales are a hurdle for the team. The second limitation is that there are no standards to use. To be effective, the scales must be tailored to the organization's specific needs, preferably designed by someone with experience in risk assessment. A third limitation is that the assessment does not include a cost-benefit analysis. The values derived for the evaluation are based on expert opinion, but the results may not be precise enough for management. The credibility of the entire assessment depends on the experience and objectivity of the organization's experts.

The Benefits of the Qualitative Risk Assessment

There are several primary benefits: it uses the opinions of experts, is easier to complete, and uses words instead of numbers, which makes the results easier to understand. As long as experts are available for a survey (possibly several surveys), the data is easy to collect and the risk assessment is easier to complete. The scales are more understandable to everyone, so special knowledge is not needed beyond the technical requirements to understand the systems and business processes in place.

Conclusion

In conclusion, qualitative risk assessment provides a simpler framework for prioritizing threats when historical data is unavailable. Leveraging expert knowledge to rank risks and the effectiveness of controls allows the organization to make informed decisions. Risk matrices are a powerful graphical tool for presenting assessment results, and this risk assessment method provides a clear, shared understanding of the organization's most significant threats. This approach is invaluable for communicating priorities to technical teams and to management.

Business Impact Analysis: Identifying What Matters Most

Coming soon.
