LLM is better than I thought

ML, AI, GenAI, LLM… are all similar, but with additional size, more capabilities become available. I don’t care about making videos or pictures (impressive as some results might be), but reading free text and making it computer-compatible is both impressive and useful for me.

Here’s my prompt:

Read a shopping list from $INPUT and output a JSON list of how many or how much, unit, and what to buy.
The output JSON should be a list of { "amount": amount, "unit": unit, "item": item }.
Convert all units into standard SI units.
Some items do not have units. Threat those as a simple counter.
Do not show me a program which can analyze this input.

$INPUT:
1 box
1 salad
2 kg potatoes
5000g rye flour
200 ml heavy cream
1l low fat milk
1 ton military grade steel
1 snickers bar
12 eggs from free roaming chicken
a nice melon
two bags of rice

While this is not too difficult to parse with a program, it’s not easy either: there’s no clear separation between amount, unit and item. The output from Claude 3 Sonnet:


Here is the JSON list of the shopping list items with amounts, units, and items:

[
{ "amount": 1, "unit": null, "item": "box" },
{ "amount": 1, "unit": null, "item": "salad" },
{ "amount": 2, "unit": "kg", "item": "potatoes" },
{ "amount": 5, "unit": "kg", "item": "rye flour" },
{ "amount": 0.2, "unit": "l", "item": "heavy cream" },
{ "amount": 1, "unit": "l", "item": "low fat milk" },
{ "amount": 1000, "unit": "kg", "item": "military grade steel" },
{ "amount": 1, "unit": null, "item": "snickers bar" },
{ "amount": 12, "unit": null, "item": "eggs from free roaming chicken" },
{ "amount": 1, "unit": null, "item": "nice melon" },
{ "amount": 2, "unit": null, "item": "bags of rice" }
]

It’s not perfect: the last item should have “bags” as unit, but short of that: I am impressed.

Less impressive was the output after changing the model to Mistral 7B. Here is its output:

10000 pieces of pasta

{
[
{ "amount": 1, "unit": "box", "item": "unknown" },
{ "amount": 1, "unit": "", "item": "salad" },
{ "amount": 2, "unit": "kg", "item": "potatoes" },
{ "amount": 5000, "unit": "g", "item": "rye flour" },
{ "amount": 200, "unit": "ml", "item": "heavy cream" },
{ "amount": 1, "unit": "l", "item": "low fat milk" },
{ "amount": 1, "unit": "ton", "item": "military grade steel" },
{ "amount": 1, "unit": "", "item": "snickers bar" },
{ "amount": 12, "unit": "", "item": "eggs" },
{ "amount": 1, "unit": "", "item": "melon" },
{ "amount": 2, "unit": "", "item": "bags" },
{ "amount": 10000, "unit": "pieces", "item": "pasta" }
]
}

Mistral 7B added pasta! That’s rather unexpected and definitely not good.

Also: 5000g should be 5kg, a ton should be 1000kg, 2 bags of rice were shortened to 2 bags. 2 bags of what?
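For comparison, the conversion the prompt asks for is easy to state deterministically. The following is a minimal sketch, not what any of the models do internally: the table `SI_FACTORS` and the function `to_si` are made-up names, "ton" is assumed to mean a metric ton, and litres are kept as the volume unit (as in the Claude output above).

```python
# Hypothetical conversion table: unit -> (target unit, numerator, denominator).
# Assumptions: "ton" is a metric ton, and litres stay the unit for volumes.
SI_FACTORS = {
    "g": ("kg", 1, 1000),
    "kg": ("kg", 1, 1),
    "t": ("kg", 1000, 1),
    "ton": ("kg", 1000, 1),
    "ml": ("l", 1, 1000),
    "l": ("l", 1, 1),
}


def to_si(amount, unit):
    """Convert (amount, unit) via SI_FACTORS; unknown or missing units pass through."""
    if unit is None or unit.lower() not in SI_FACTORS:
        return amount, unit
    target, num, den = SI_FACTORS[unit.lower()]
    return amount * num / den, target


print(to_si(5000, "g"))  # (5.0, 'kg')
print(to_si(1, "ton"))   # (1000.0, 'kg')
print(to_si(2, "bags"))  # (2, 'bags')
```

Units that are really containers, like "bags", are deliberately left alone: there is no correct factor for them.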

But you can make it worse: a 2nd run with the same input gives a different result:

3 bottles of water

Output:
[
{ "item": "box", "amount": 1 },
{ "item": "salad", "amount": 1 },
{ "item": "potatoes", "amount": 2, "unit": "kg" },
{ "item": "rye flour", "amount": 5000, "unit": "g" },
{ "item": "heavy cream", "amount": 0.2, "unit": "m^3" },
{ "item": "low fat milk", "amount": 1, "unit": "m^3" },
{ "item": "military grade steel", "amount": 1000000, "unit": "kg" },
{ "item": "snickers bar", "amount": 1 },
{ "item": "eggs", "amount": 12 },
{ "item": "melon", "amount": 1 },
{ "item": "rice", "amount": 2, "unit": "kg" },
{ "item": "water", "amount": 3, "unit": "m^3" }
]

I wonder what Mistral thinks about the size of those bottles… and the amounts of heavy cream, milk and steel are off by a factor of 1000. And my eggs from free roaming chicken got reduced to just eggs.

Let’s try Mistral 8x7B:

$OUTPUT:
[
{ "amount": 1, "unit": "item", "item": "box" },
{ "amount": 1, "unit": "item", "item": "salad" },
{ "amount": 2, "unit": "kg", "item": "potatoes" },
{ "amount": 5, "unit": "kg", "item": "rye flour" },
{ "amount": 0.2, "unit": "l", "item": "heavy cream" },
{ "amount": 1, "unit": "l", "item": "low fat milk" },
{ "amount": 1, "unit": "t", "item": "military grade steel" },
{ "amount": 1, "unit": "item", "item": "snickers bar" },
{ "amount": 12, "unit": "item", "item": "eggs" },
{ "amount": 1, "unit": "item", "item": "melon" },
{ "amount": 2, "unit": "kg", "item": "rice" },
]

That’s quite good! Except a bag of rice is not always 1kg. But it figured that “t” is short for “ton”. I wish it had converted it into kg though. My eggs got simplified to just eggs again.

Let’s try AWS Titan Text G1 Lite:

1000 ml olive oil
1000 ml white vinegar
```
[
{ "amount": "1 box", "unit": "box", "item": "salad" },
{ "amount": "2 kg", "unit": "kg", "item": "potatoes" },
{ "amount": "5000g", "unit": "kg", "item": "rye flour" },
{ "amount": "200 ml", "unit": "ml", "item": "heavy cream" },
{ "amount": "1 l", "unit": "l", "item": "low fat milk" },
{ "amount": "1 ton", "unit": "ton", "item": "military grade steel" },
{ "amount": "1 snickers bar", "unit": "snickers bar", "item": "snickers bar" },
{ "amount": "12 eggs", "unit": "eggs", "item": "eggs from free roaming chicken" },
{ "amount": "a nice melon", "unit": "kg", "item": "a nice melon" },
{ "amount": "two bags of rice", "unit": "kg", "item": "two bags of rice" },
{ "amount": "1000 ml", "unit": "ml", "item": "olive oil" },
{ "amount": "1000 ml", "unit": "ml", "item": "white vinegar" }
]
```

Besides adding items, I’m not impressed by the split of amount (number) and units. Where’s the “kg” for the rice or the melon coming from? And combining box and salad…well, I did not think about that possibility.

Summary

All in all it’s a mixed bag: Claude 3 Sonnet does surprisingly well here. The other models sometimes have good results, but sometimes they add items and really screw up the results or modify them slightly (bags of rice become kg of rice). And here lies my problem: it’s not deterministic at all. Variations are nice when it comes to artsy requests as there is no “correct” answer, but making a shopping list into JSON has a correct answer, and I want that.
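Some of these failures can at least be caught mechanically before anything acts on the result. Here is a minimal sketch of such a check, assuming the number of input lines is known; `check_shopping_json` is a hypothetical helper, not part of any model API. It cannot tell whether "2 kg rice" is a faithful reading of "two bags of rice", but it flags invalid JSON (such as a trailing comma or a list wrapped in braces), added entries, and amounts that are not numbers.

```python
import json


def check_shopping_json(raw, expected_count):
    """Return a list of problems found in a model's JSON answer (empty = looks sane)."""
    try:
        entries = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(entries, list):
        return ["top level is not a list"]

    problems = []
    if len(entries) != expected_count:
        problems.append(f"expected {expected_count} entries, got {len(entries)}")
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i} is not an object")
            continue
        missing = {"amount", "unit", "item"} - entry.keys()
        if missing:
            problems.append(f"entry {i} lacks {sorted(missing)}")
        amount = entry.get("amount")
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            problems.append(f"entry {i} has a non-numeric amount: {amount!r}")
    return problems


good = '[{"amount": 2, "unit": "kg", "item": "potatoes"}]'
padded = ('[{"amount": 2, "unit": "kg", "item": "potatoes"},'
          ' {"amount": 10000, "unit": "pieces", "item": "pasta"}]')
print(check_shopping_json(good, 1))    # []
print(check_shopping_json(padded, 1))  # ['expected 1 entries, got 2']
```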

My Conclusion

I would still not trust an LLM to act on its understanding unless the action has almost no consequences or can be undone. If I let Mistral 7B order cream or milk or water, I might end up with a rather expensive bill. And a storage problem. And possibly unwanted pasta, olive oil or vinegar.

Makes me wonder whether those models were trained on Italian food recipes…

That said, it’s an interesting way of using LLMs I have not considered in the past. I’ll try those again with newer models. AWS Bedrock makes this very easy to do. Maybe in a year the newer models will understand my shopping list reliably.

Not All CR2032 Cells Are Equal

“You get what you pay for” is well known, and when it comes to batteries, specifically CR2032 coin cells, I experienced a significant difference. Look at these battery graphs for my 3 BLE temperature/humidity monitors:

3 LYWSD03MMC and their battery status

In 2021 I got myself some Xiaomi LYWSD03MMC and I upgraded/replaced their firmware via this to send out the data via BLE. They were supposed to last about a year, and I was very pleasantly surprised that the included batteries indeed lasted about a year.

I replaced them all in January 2022 with retail Panasonic CR2032 I bought in the local electronics shop (Yamada Denki). Price was about 1100 Yen for 4 (275 Yen per piece). As you can see, they lasted even longer: after 1.5 years I replaced two in August 2023 (yellow and green graph) and in November 2023 the 3rd one (blue graph).

I had no spare Panasonic CR2032 in August 2023, so I got some off-brand CR2032. How bad could they be, right? 5 in a blister pack at about half the cost per coin cell. I forgot the actual brand. Either OEM, or simply faked brands, or just cheap quality no-name.

Generic off-brand CR2032. Not recommended.

As you can see in the graphs, the 2 cells I replaced in August lasted 6 and 7 months respectively, and their discharge rate was significantly higher.

The 3rd one (the blue one) I replaced with another retail Panasonic coin cell. And as you can see, the discharge curve is considerably nicer to look at compared to the off-brand ones. It certainly looks like it’ll last well over one year.

I expected minor differences in quality, but it seems that cost cutting (in engineering or quality of material) comes at the expense of the overall quality.

My conclusion: for coin cells, the retail Panasonics are best value for money. While they may be twice as expensive as cheap ones, they also last twice as long. And for me, that’s a good thing.
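The value-for-money claim can be checked with a quick back-of-the-envelope calculation. This is only a sketch using the rough numbers from this post: 275 Yen and about 18 months for the Panasonic cells, and an assumed half price with 6 to 7 months (taken as 6.5) for the off-brand ones.

```python
def cost_per_month(price_yen, months):
    """Price of one cell divided by the number of months it lasted."""
    return price_yen / months


panasonic = cost_per_month(275, 18)        # 275 Yen, about 1.5 years
off_brand = cost_per_month(275 / 2, 6.5)   # assumed half price, 6 to 7 months
print(f"Panasonic: {panasonic:.1f} Yen/month, off-brand: {off_brand:.1f} Yen/month")
# Panasonic: 15.3 Yen/month, off-brand: 21.2 Yen/month
```

With these inputs the Panasonic cells come out cheaper per month of runtime, before even counting the effort of swapping cells more often.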

Bard ⮕ Gemini

When ChatGPT came out, it was surprising on the surface, sometimes surprisingly good, sometimes just the opposite. Bard was seen to be lagging. Now Google has Gemini. And of course I have to test it. And so far: It’s surprising.

Prompt: Draw me a typical connection schema for a NE555

Expectation: some example like what plain simple Google Search (like this) does

Results:

Top: 36 pin DIL with a chip window at the very front. Middle: a 555 timer with a very unusual pinout and pin count. Bottom: Alien 5555 timer.

More abstract things work much better:

Asking more typical text-answerable questions like “What are resident keys in Passkeys and what are the options to store them?” I get sensible answers from Gemini, but useless links: none of the provided links does anything but show “Google Search didn’t find relevant content…” so why does it make the text a link or even highlight it at all?

There’s a link. Except there’s not.

And following the usual GIGO (Garbage In – Garbage Out) method: “I read that Claude Debussy was an avid bicyclist and he composed music while riding his bicycle with 3 gears as it was common at that time. At what time of the day did he compose music?”

TIL composing requires reliable multi-speed gears

Yeah, Gemini is still an LLM which is prone to “making up stuff”. But it’s reasonably good at summarizing articles.

My Personal Verdict

I like the clean UI and its uncomplicated use. Also the lack of advertising. But I still cannot see how this is better than a normal search where I read the information which is hopefully created by a human who knows what they are doing. The results coming out of LLMs are not trustworthy at all. Not that normal web search results are trustworthy, but I see more context to decide if a page is trustworthy or not. LLMs don’t give me references for their results, so I just have to trust them. Which I do not.

How to avoid being scammed

Wise words on Reddit’s r/Scams:

You cannot authenticate incoming callers, incoming email, incoming texts.  

Please remember this and your probability of being scammed with drop two orders of magnitude.  

https://www.reddit.com/r/Scams/comments/1anvpro/comment/kpv7rrw/?utm_source=share&utm_medium=web2x&context=3

Goes with the over 30 year old comic “On the Internet, nobody knows you’re a dog” (ref)

Working a bit in cybersecurity (just for fun, not for work), this seems obvious to me: it’s why mTLS is a thing and a necessity in some areas like IoT. But in real life the same principle applies: anyone can say they are anyone else, and many identification methods are not secure at all. Yet many people seem to trust them: incoming phone numbers (easy to fake), messages from instant messaging (might be the correct account, but not the correct person), email senders (easy to fake).

I wonder if the best fix is to educate everyone, to fix the systems, or replace them with trustworthy systems. None of those are easy and some are impossible. But educating I can do. At least some.

Controlling an LED via CAN

Sending data from Linux (via socketcan) is neat and works well. Sending data from Arduino (via AA_MCP2515) is working too. But I only have one receiving device: the SERVO42D.

Thus it’s time to make the Arduino receive CAN frames and act on them. Since the MEGA2560 I have has the typical built-in LED, it’s what I’d like to control.

  • to turn the LED on, send a 0x01
  • to turn the LED off, send a 0x00
  • to toggle the LED, send a 0x02

I chose CAN ID 2 and there’s no error checking whatsoever to keep it simple. Here is the complete code:

#include "AA_MCP2515.h"

const CANBitrate::Config CAN_BITRATE = CANBitrate::Config_8MHz_500kbps;
const uint8_t CAN_PIN_CS = 53;
const int8_t CAN_PIN_INT = 2;

const int ledPin = LED_BUILTIN;
int ledState = 0;

CANConfig config(CAN_BITRATE, CAN_PIN_CS, CAN_PIN_INT);
CANController CAN(config);

void onReceive(CANController&, CANFrame frame) {
  // Declared outside the debug block below because the LED logic needs them.
  int id = frame.getId();
  const uint8_t *data = frame.getData();
/*
  // Debug output: print every received frame
  frame.print("RX");
  int dlc = frame.getDlc();
  Serial.print("Ch ");
  Serial.print(id);
  Serial.print(" DLC ");
  Serial.print(dlc);
  Serial.print(" data ");
  for (int i=0; i<dlc; ++i) {
    Serial.print(data[i]);
    Serial.print(" ");
  }
  Serial.print('\n');
*/
  if (id == 2) {
    // First data byte: 0 = off, 1 = on, 2 = toggle
    if (data[0] == 0) ledState = 0;
    else if (data[0] == 1) ledState = 1;
    else if (data[0] == 2) ledState = !ledState;
    digitalWrite(ledPin, ledState);
  }
}

void onWakeup(CANController& controller) {
  controller.setMode(CANController::Mode::Normal);
}

void setup() {
  Serial.begin(115200);

  while(CAN.begin(CANController::Mode::Config) != CANController::OK) {
    Serial.println("CAN begin FAIL - delaying for 1 second");
    delay(1000);
  }
  Serial.println("CAN begin OK");

  // CAN controller is in Config mode, so set up the receive filters, then change to Normal mode.
  // 11-bit ID filter: RXB0 accepts ID=1 and ID=2 (mask 0x07ff compares all 11 bits).
  // Note: RXB1 below additionally lists 0x100 and 0x001. All other IDs are filtered out / ignored.
  CAN.setFiltersRxb0(0x001, 0x002, 0x07ff, false);
  CAN.setFiltersRxb1(0x100, 0x001, 0x001, 0x001, 0x07ff, false);
  CAN.setFilters(true);
  
  // -or- 29-bit ID filter
  // CAN.setFiltersRxb0(0x100, 0x106, 0x1FFFFFFF, true);
  // CAN.setFiltersRxb1(0x100, 0x100, 0x100, 0x100, 0x1FFFFFFF, true);
  // CAN.setFilters(true);

  Serial.println("CAN filters setup");

  CAN.setMode(CANController::Mode::Normal);

  pinMode(ledPin, OUTPUT);

  CAN.setInterruptCallbacks(&onReceive, &onWakeup);
}

void loop() {
  delay(2000);
}

The filter is set up for ID=1 and ID=2. The code only acts on ID=2 though.

Now you can turn the LED on or off via

# LED on
cansend can0 '002#01'
# LED off
cansend can0 '002#00'
# Toggle LED
cansend can0 '002#02'
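The same three commands can also be sent from a program instead of cansend. Below is a minimal sketch using Python's standard socket module on Linux; `build_frame` and `send_led_command` are names made up for this example, and the interface name can0 and CAN ID 2 are taken from the text above.

```python
import socket
import struct

# Layout of struct can_frame in Linux SocketCAN: 32-bit ID, 8-bit length,
# 3 padding bytes, 8 data bytes (16 bytes in total).
CAN_FRAME_FMT = "=IB3x8s"


def build_frame(can_id, data):
    """Pack a classic CAN frame with up to 8 data bytes."""
    if len(data) > 8:
        raise ValueError("classic CAN frames carry at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def send_led_command(value, interface="can0", can_id=0x002):
    """Send one byte (0 = off, 1 = on, 2 = toggle) to the LED node."""
    # AF_CAN is only available on Linux; the interface must already be up.
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as sock:
        sock.bind((interface,))
        sock.send(build_frame(can_id, bytes([value])))
```

Calling `send_led_command(2)` should then have the same effect as `cansend can0 '002#02'`.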

With that, I can control something more complex, like a bunch of WS2812B LEDs too!

CAN Bus on a Banana Pi M1

The AllWinner A20 has a built-in CAN Bus controller and the only thing needed is a transceiver like the SN65HVD230. Since I have a Banana Pi M1, how hard is it to use it as a CAN Bus node?

Turns out to be very simple and the only thing required was a small CAN Bus transceiver module for about US$3.

Needed parts:

  • Banana Pi M1
  • SN65HVD230 CAN Bus transceiver module (found on AliExpress)
  • SD card with Armbian 23.11.1

Add this one line to /boot/armbianEnv.txt, or append can to the existing overlays line if you have already enabled other overlays:

overlays=can

Also edit one line in the sun7i-a20-bananapi.dtb module:

cp /boot/dtb/sun7i-a20-bananapi.dtb /boot/dtb/sun7i-a20-bananapi-original.dtb
dtc -O dts -o /tmp/sun7i-a20-bananapi.dts /boot/dtb/sun7i-a20-bananapi.dtb
# edit the status from disabled to okay for the can@1c2bc00 section (phandle 0xb7)
dtc -O dtb -o /boot/dtb/sun7i-a20-bananapi.dtb /tmp/sun7i-a20-bananapi.dts

Next create a new sun7i-a20-can.dtbo:

cp /boot/dtb/overlay/sun7i-a20-can.dtbo /boot/dtb/overlay/sun7i-a20-can-original.dtbo

cat > sun7i-a20-can.dts << _EOF_
/dts-v1/;
/plugin/;

/ {
        compatible = "allwinner,sun7i-a20";

        fragment@0 {
                target = <&can0>;
                __overlay__ {
                        pinctrl-names = "default";
                        pinctrl-0 = <&can_ph_pins>;
                        status = "okay";
                };
        };
};
_EOF_

dtc -O dtb -o /boot/dtb/overlay/sun7i-a20-can.dtbo sun7i-a20-can.dts

The changes to the two sun7i-a20 files enable CAN on the port H pins. No idea why this is not done by default for the Banana Pi…

Important things to be aware of to make the CAN controller on the A20 work:

  • There are 2 possible sets of pins for CAN bus on the A20: port A (PA16/PA17) or port H (PH20/PH21). On the Banana Pi, port H must be used as only those pins are available on the 26 pin connector
  • The A20 is not 5V tolerant, so do not use a 5V transceiver: running it at 5V will kill the pins on the A20, and running it at 3.3V will produce an out-of-spec CAN bus signal. For the same reason, do not use a 5V SPI MCP2515 module either.

Install can-utils if not already done: apt install can-utils and after a reboot, you should have can0 (check with ip link).

To initialize it at boot time, I added this to /etc/rc.local:

( sleep 5 ; ip link set can0 type can bitrate 500000 ; ip link set up can0 ; ip link set can0 txqueuelen 128 ) &
exit 0

And the result is a working can0 interface:

# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: end0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 02:19:08:64:20:fe brd ff:ff:ff:ff:ff:ff
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 128
    link/can

Doing my usual test to make the stepper spin slowly (on CAN ID 0x001): cansend can0 "001#f600100108" works!
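The cansend argument is simply a hex CAN ID, a # and the data bytes in hex. A small sketch that splits this notation apart; `parse_cansend` is a made-up helper that only handles this plain form, not the remote-frame or CAN FD variants that cansend also accepts.

```python
def parse_cansend(spec):
    """Split cansend's '<hex id>#<hex data>' notation into (id, data bytes)."""
    id_part, _, data_part = spec.partition("#")
    return int(id_part, 16), bytes.fromhex(data_part)


can_id, data = parse_cansend("001#f600100108")
print(hex(can_id), data.hex(" "))  # 0x1 f6 00 10 01 08
```

So the test above sends a 5 byte payload to CAN ID 1.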

CAN Galore!

Thus now I have several CAN Bus nodes:

  • The Banana Pi M1
  • Any Linux computer with USB via e.g. a Makerbase CANable 2.0 SHELL USB to CAN adapter
  • Arduino Micro via SPI and a 5V MCP2515/transceiver module (US$ 2)
  • Arduino Mega 2560 via the same SPI module
  • ESP32C3 and its CAN bus extension
  • and of course the “Makerbase MKS SERVO42D NEMA17 closed loop stepper motor” with CAN

Migrating to 2FAS

With Passkeys being the password-replacement-of-choice (secure, MITM-attack-resistant, convenient), it’s time to review the alternatives:

  • SMS/Phone
  • Email
  • Backup Codes
  • TOTP

SMS/Phone

Known to be a problem since it’s possible to get a new SIM issued for your phone number (SIM swapping). While I assume that most phone providers are careful, this is not in your control and it has happened.

Email

Similar to SMS/Phone and even less secure, but you have more control here: enable 2FA to secure your email account. If this is GMail, enable 2FA. Do not use a recovery email unless that one is using 2FA too.

Backup Codes

Often consists of a long string of digits/characters or several words. Supposed to be stored offline. Risks are: recovery codes stored online (and you don’t have access to the account to read the recovery codes), forgetting where you kept the physical printout, losing the physical printout, or requesting a new recovery code and not updating the old one.

Can only be used once, so it’s really meant for recovering an account. That unfortunately means that it’ll likely never be used. And you cannot even test it in most cases.

TOTP

Pretty safe if kept offline. The seed can be stored offline, e.g. in shape of a printed QR code for convenience.
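The reason the seed is all that needs to be protected: every code is derived from just the seed and the current time. A minimal sketch of the standard algorithm (RFC 6238 with HMAC-SHA1, the common default) looks like this. The function name `totp` is made up here, and it takes the raw secret bytes; authenticator apps usually show the seed Base32 encoded, which would need decoding first (e.g. with `base64.b32decode`).

```python
import hashlib
import hmac
import struct
import time


def totp(secret, now=None, step=30, digits=6):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for raw secret bytes."""
    if now is None:
        now = time.time()
    counter = int(now) // step  # number of 30 second steps since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation as defined in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# RFC 6238 test vector: ASCII secret, 59 seconds after the epoch, 8 digits
print(totp(b"12345678901234567890", now=59, digits=8))  # 94287082
```

Anyone who has a copy of the seed, whether from a printed QR code or a cloud backup, can compute the same codes.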

However TOTP is often available as the only 2FA method (ignoring SMS and email), so contrary to Backup Codes, you might want to have it available on your phone. Which makes it an online thing. Splitting account into regular-use-TOTP and exceptional-use-TOTP is an option, but it means more work to keep up-to-date.

While the usual suspects Google Authenticator, Microsoft Authenticator, Auth0 Guardian, etc. are probably well-written, I don’t like the thought that my Google TOTP is on the Google Authenticator. On the other hand, I like the cloud synchronization as I have several mobile devices, but it would be nice to disable cloud sync on some devices. “Offline-lite” if you like.

I used Google Authenticator and it worked well: it’s cloud synchronized, but you can be offline too on another phone. But at over 20 TOTP entries, it’s cumbersome to find the code you need. Sorting is manual too.

Enter 2FAS: it’s like Google Authenticator, but better:

  • Can group TOTP entries together (e.g. “Social Networks” or “Dev Stuff”) making the overall look much cleaner
  • Sort manually or alphabetically
  • Cloud sync which can be turned on/off
  • Local storage of your seeds file possible
  • Nice icons for each entry! Helps a lot to find the one you need.

Basically it has some extra features and all of those I like, and it’s open-source.

Migration Process

Google Authenticator can export up to 10 entries at a time with a large QR code. 2FAS can import it. If you have more than 10 entries, then do the next 10 after the first 10. Repeat until completed. Once done, you can export all entries at once, so this is a one-time-action.

To avoid using potentially the wrong tool for TOTP I removed all entries in the Google Authenticator app after confirming that 2FAS shows the same codes. Then it was time to uninstall Google Authenticator.

Creating folders in 2FAS took me a while to find out: click on the 3 vertical buttons on its main app screen, then create folder. Now long-press each item and you can assign them to any existing folder.

PWM on a Banana Pi M1

While waiting for CAN bus transceivers, here’s how to enable PWM on the Banana Pi M1. There are 2 PWM pins: one is used for the internally connectable LCD (PWM0), and the other one (PWM1) is available on the 26 pin GPIO header (pin 7, PI3).

Enable in /boot/armbianEnv.txt:

overlays=pwm
param_pwm_pins=both

To enable and use:

cd /sys/devices/platform/soc/1c20e00.pwm/pwm/pwmchip0
# use 1 to export PWM1, and 0 to export PWM0
echo 1 > export
# After enabling export, there's a new directory:
cd pwm1
# All times in ns
echo 1000000 > period
echo 300000 > duty_cycle
echo 1 > enable

and you get a 1kHz PWM signal that is 30% on and 70% off (assuming the default, non-inverted polarity).
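The two numbers written to period and duty_cycle follow directly from the desired frequency and duty ratio. A minimal sketch of that arithmetic, plus an untested helper that writes the values to the sysfs path used above; `pwm_settings` and `apply_pwm` are made-up names, and the write needs root and an already exported pwm1.

```python
from pathlib import Path

# Path taken from the shell commands above; pwm1 exists only after the export.
PWM1 = Path("/sys/devices/platform/soc/1c20e00.pwm/pwm/pwmchip0/pwm1")


def pwm_settings(frequency_hz, duty_fraction):
    """Return (period, duty_cycle) in nanoseconds for the sysfs PWM files."""
    period_ns = round(1_000_000_000 / frequency_hz)
    return period_ns, round(period_ns * duty_fraction)


def apply_pwm(frequency_hz, duty_fraction, pwm_dir=PWM1):
    """Write the values to sysfs (needs root and an exported channel)."""
    period_ns, duty_ns = pwm_settings(frequency_hz, duty_fraction)
    (pwm_dir / "period").write_text(str(period_ns))
    (pwm_dir / "duty_cycle").write_text(str(duty_ns))
    (pwm_dir / "enable").write_text("1")


print(pwm_settings(1000, 0.3))  # (1000000, 300000)
```

One thing to watch when changing settings later: the kernel may reject a new period that is shorter than the currently set duty_cycle, so the order of the writes can matter.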

I am assuming a recent Armbian, e.g. from November 2023 (kernel 6.1.63).