40 changes: 30 additions & 10 deletions docs/advanced/error-handling.md
Original file line number Diff line number Diff line change
@@ -3,13 +3,13 @@
The SDK throws specific exceptions to help you handle different failure scenarios:

```
Error
└── DurableExecutionError - Internal SDK control-flow/error base type
└── SuspendExecutionException - Internal signal used by the SDK to suspend execution
(for example during `wait()`, `waitForCallback()`, and
`waitForCondition()`). User code should not catch it.

RuntimeException
├── SuspendExecutionException - Internal control-flow exception thrown by the SDK to suspend execution
│ (e.g., during wait(), waitForCallback(), waitForCondition()).
│ The SDK catches this internally — you will never see it unless you have
│ a broad catch(Exception) block around durable operations. If caught
│ accidentally, you MUST re-throw it so the SDK can suspend correctly.
└── DurableExecutionException - General durable exception
├── SerDesException - Serialization and deserialization exception.
├── UnrecoverableDurableExecutionException - Execution cannot be recovered. The durable execution will be immediately terminated.
@@ -52,16 +52,36 @@ try {

### Handling SuspendExecutionException

If you have a broad `catch (Exception e)` block around durable operations, you must re-throw `SuspendExecutionException` to let the SDK suspend correctly:
`SuspendExecutionException` is an internal SDK control-flow signal. It extends `Error`, not `Exception`, so a

Contributor: And you're also updating these error handling docs? Is this intended?

Contributor Author: Yeah, obsolete docs. Should be updated in this PR or a separate one.

normal `catch (Exception e)` block will not intercept it.
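
The claim above rests on standard Java semantics: `Error` and `Exception` are sibling subclasses of `Throwable`, so a `catch (Exception e)` clause never matches an `Error`. A minimal sketch of that behavior follows. `SuspendSignal` is a hypothetical stand-in defined only for this illustration; it is not the SDK's `SuspendExecutionException`, and no SDK classes are used.

```java
public class ErrorVsExceptionDemo {
    // Hypothetical stand-in for an internal control-flow signal that extends Error.
    static final class SuspendSignal extends Error {
        SuspendSignal(String message) {
            super(message);
        }
    }

    // Throws the signal inside a try block guarded only by catch (Exception).
    static String runWithCatchException() {
        try {
            try {
                throw new SuspendSignal("suspend");
            } catch (Exception e) {
                // Not reached: SuspendSignal is an Error, not an Exception.
                return "caught by catch (Exception)";
            }
        } catch (SuspendSignal s) {
            // The signal escaped the inner handler and arrived here.
            return "propagated past catch (Exception)";
        }
    }

    // Throws the signal inside a try block guarded by catch (Throwable).
    static String runWithCatchThrowable() {
        try {
            throw new SuspendSignal("suspend");
        } catch (Throwable t) {
            // Reached: Throwable is the common supertype of Error and Exception.
            return "caught by catch (Throwable)";
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithCatchException());
        System.out.println(runWithCatchThrowable());
    }
}
```

In this sketch the first method reports that the signal propagated, while the second intercepts it, which is why only `catch (Throwable)` and explicit catches of the signal type need the re-throw.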

The real risk is code that catches `Throwable`, or code that explicitly catches `SuspendExecutionException`. In those
cases, you must re-throw it immediately so the SDK can suspend the execution correctly.

```java
try {
ctx.step("work", String.class, stepCtx -> doWork());
ctx.wait("pause", Duration.ofDays(1));
ctx.step("more-work", String.class, stepCtx -> doMoreWork());
} catch (SuspendExecutionException e) {
throw e; // Always re-throw — lets the SDK suspend the execution
} catch (Exception e) {
throw e; // Always re-throw internal suspension signals
} catch (Throwable t) {
log.error("Unexpected throwable", t);
throw t;
}
```

Avoid broad `catch (Throwable)` blocks around durable operations unless you have a strong reason to use them. Prefer
catching specific application exceptions instead:

```java
try {
ctx.step("work", String.class, stepCtx -> doWork());
ctx.wait("pause", Duration.ofDays(1));
ctx.step("more-work", String.class, stepCtx -> doMoreWork());
} catch (SuspendExecutionException e) {
throw e; // Always re-throw internal suspension signals
} catch (MyBusinessException e) {
log.error("Operation failed", e);
}
```
```
Original file line number Diff line number Diff line change
@@ -2,6 +2,8 @@
// SPDX-License-Identifier: Apache-2.0
package software.amazon.lambda.durable.examples.general;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.examples.types.GreetingRequest;
@@ -13,15 +15,16 @@
* in log entries via MDC. By default, logs are suppressed during replay to avoid duplicates.
*/
public class LoggingExample extends DurableHandler<GreetingRequest, String> {
Logger logger = LoggerFactory.getLogger(LoggingExample.class);

@Override
public String handleRequest(GreetingRequest input, DurableContext context) {
// Log at execution level (outside any step)
context.getLogger().info("Processing greeting for: {}", input.getName());
context.getLogger(logger).info("Processing greeting for: {}", input.getName());

// Step 1: Create greeting - logs inside step include operation context
var greeting = context.step("create-greeting", String.class, ctx -> {
ctx.getLogger().info("Creating greeting message");
ctx.getLogger(logger).info("Creating greeting message");
return "Hello, " + input.getName();
});

Original file line number Diff line number Diff line change
@@ -253,6 +253,14 @@ void testWaitAtLeastInProcessExample() {
assertTrue(asyncOp.getStepResult(String.class).contains("Processed: TestUser"));
}

@Test
void testLoggingExample() {
var runner = CloudDurableTestRunner.create(
arn("logging-example"), GreetingRequest.class, String.class, lambdaClient);
var result = runner.run(new GreetingRequest("TestUser"));
assertEquals(ExecutionStatus.SUCCEEDED, result.getStatus());
}

@Test
void testGenericTypesExample() {
var runner = CloudDurableTestRunner.create(
Original file line number Diff line number Diff line change
@@ -22,6 +22,10 @@
import software.amazon.lambda.durable.model.WaitForConditionResult;

public interface DurableContext extends BaseContext {
static DurableContext getCurrentContext() {
return (DurableContext) BaseContext.getCurrentContext();
}

/**
* Executes a durable step with the given name and blocks until it completes.
*
Original file line number Diff line number Diff line change
@@ -5,9 +5,10 @@
import java.util.function.Function;
import software.amazon.lambda.durable.config.ParallelBranchConfig;
import software.amazon.lambda.durable.model.ParallelResult;
import software.amazon.lambda.durable.model.SafeCloseable;

/** User-facing context for managing parallel branch execution within a durable function. */
public interface ParallelDurableFuture extends AutoCloseable, DurableFuture<ParallelResult> {
public interface ParallelDurableFuture extends SafeCloseable, DurableFuture<ParallelResult> {

/**
* Registers and immediately starts a branch (respects maxConcurrency).
@@ -68,7 +69,4 @@ default <T> DurableFuture<T> branch(
*/
<T> DurableFuture<T> branch(
String name, TypeToken<T> resultType, Function<DurableContext, T> func, ParallelBranchConfig config);

/** Calls {@link #get()} if not already called. Guarantees that the context is closed. */
void close();
}
Original file line number Diff line number Diff line change
@@ -7,4 +7,8 @@
public interface StepContext extends BaseContext {
/** Returns the current retry attempt number (0-based). */
int getAttempt();

static StepContext getCurrentContext() {
return (StepContext) BaseContext.getCurrentContext();
}
}
Original file line number Diff line number Diff line change
@@ -3,17 +3,36 @@
package software.amazon.lambda.durable.context;

import com.amazonaws.services.lambda.runtime.Context;
import org.slf4j.Logger;
import software.amazon.lambda.durable.DurableConfig;
import software.amazon.lambda.durable.logging.DurableLogger;

public interface BaseContext extends AutoCloseable {
public interface BaseContext {
ThreadLocal<BaseContext> CONTEXT = new ThreadLocal<>();

/**
* Gets the current context (DurableContext or StepContext) for this thread.
*
* @return the current context or null if not set
*/
static BaseContext getCurrentContext() {
return CONTEXT.get();
}
/**
* Gets a logger with additional information of the current execution context.
*
* @return a DurableLogger instance
*/
DurableLogger getLogger();

/**
* Gets a logger with additional information of the current execution context.
*
* @param delegate the logger to wrap
* @return a DurableLogger instance
*/
DurableLogger getLogger(Logger delegate);

/**
* Returns the AWS Lambda runtime context.
*
@@ -46,7 +65,4 @@ public interface BaseContext extends AutoCloseable {

/** Returns whether this context is currently in replay mode. */
boolean isReplaying();

/** Closes this context. */
void close();
}
Original file line number Diff line number Diff line change
@@ -3,11 +3,13 @@
package software.amazon.lambda.durable.context;

import com.amazonaws.services.lambda.runtime.Context;
import org.slf4j.Logger;
import software.amazon.lambda.durable.DurableConfig;
import software.amazon.lambda.durable.execution.ExecutionManager;
import software.amazon.lambda.durable.execution.ThreadType;
import software.amazon.lambda.durable.logging.DurableLogger;

public abstract class BaseContextImpl implements AutoCloseable, BaseContext {
public abstract class BaseContextImpl implements BaseContext {
private final ExecutionManager executionManager;
private final DurableConfig durableConfig;
private final Context lambdaContext;
@@ -109,4 +111,18 @@ public boolean isReplaying() {
public void setExecutionMode() {
this.isReplaying = false;
}

/** Returns a durable logger for this context. */
public DurableLogger getLogger() {
return DurableLogger.INSTANCE;
}

/** Returns a durable logger for this context. */
public DurableLogger getLogger(Logger delegate) {
return new DurableLogger(delegate);
}

public static void setCurrentContext(BaseContext context) {
CONTEXT.set(context);
}
}
Original file line number Diff line number Diff line change
@@ -10,7 +10,6 @@
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.function.Function;
import org.slf4j.LoggerFactory;
import software.amazon.lambda.durable.DurableCallbackFuture;
import software.amazon.lambda.durable.DurableConfig;
import software.amazon.lambda.durable.DurableContext;
@@ -32,7 +31,6 @@
import software.amazon.lambda.durable.execution.OperationIdGenerator;
import software.amazon.lambda.durable.execution.SuspendExecutionException;
import software.amazon.lambda.durable.execution.ThreadType;
import software.amazon.lambda.durable.logging.DurableLogger;
import software.amazon.lambda.durable.model.MapResult;
import software.amazon.lambda.durable.model.OperationIdentifier;
import software.amazon.lambda.durable.model.OperationSubType;
@@ -62,7 +60,6 @@ public class DurableContextImpl extends BaseContextImpl implements DurableContex
private final OperationIdGenerator operationIdGenerator;
private final DurableContextImpl parentContext;
private final boolean isVirtual;
private volatile DurableLogger logger;

/** Shared initialization — sets all fields. */
private DurableContextImpl(
@@ -430,30 +427,6 @@ private static <T> T executeRetryLoop(
}

// =============== accessors ================
@Override
public DurableLogger getLogger() {
// lazy initialize logger
if (logger == null) {
synchronized (this) {
if (logger == null) {
logger = new DurableLogger(LoggerFactory.getLogger(DurableContext.class), this);
}
}
}
return logger;
}

/**
* Clears the logger's thread properties. Called during context destruction to prevent memory leaks and ensure clean
* state for subsequent executions.
*/
@Override
public void close() {
if (logger != null) {
logger.close();
}
}

/**
* Get the next operationId. Returns a globally unique operation ID by hashing a sequential operation counter. For
* root contexts, the counter value is hashed directly (e.g. "1", "2", "3"). For child contexts, the values are
Original file line number Diff line number Diff line change
@@ -3,12 +3,10 @@
package software.amazon.lambda.durable.context;

import com.amazonaws.services.lambda.runtime.Context;
import org.slf4j.LoggerFactory;
import software.amazon.lambda.durable.DurableConfig;
import software.amazon.lambda.durable.StepContext;
import software.amazon.lambda.durable.execution.ExecutionManager;
import software.amazon.lambda.durable.execution.ThreadType;
import software.amazon.lambda.durable.logging.DurableLogger;

/**
* Context available inside a step operation's user function.
@@ -17,7 +15,6 @@
* {@link BaseContext} for thread lifecycle management.
*/
public class StepContextImpl extends BaseContextImpl implements StepContext {
private volatile DurableLogger logger;
private final int attempt;

/**
@@ -46,25 +43,4 @@ protected StepContextImpl(
public int getAttempt() {
return attempt;
}

@Override
public DurableLogger getLogger() {
// lazy initialize logger
if (logger == null) {
synchronized (this) {
if (logger == null) {
logger = new DurableLogger(LoggerFactory.getLogger(StepContext.class), this);
}
}
}
return logger;
}

/** Closes the logger for this context. */
@Override
public void close() {
if (logger != null) {
logger.close();
}
}
}
Original file line number Diff line number Diff line change
@@ -22,6 +22,7 @@
import software.amazon.lambda.durable.exception.DurableOperationException;
import software.amazon.lambda.durable.exception.IllegalDurableOperationException;
import software.amazon.lambda.durable.exception.UnrecoverableDurableExecutionException;
import software.amazon.lambda.durable.logging.DurableLogger;
import software.amazon.lambda.durable.model.DurableExecutionInput;
import software.amazon.lambda.durable.model.DurableExecutionOutput;
import software.amazon.lambda.durable.plugin.InvocationEndInfo;
@@ -69,9 +70,10 @@ public static <I, O> DurableExecutionOutput execute(

var userInput = extractUserInput(
executionManager.getExecutionOperation(), config.getSerDes(), inputType);
// use try-with-resources to clear logger properties
try (var context =
DurableContextImpl.createRootContext(executionManager, config, lambdaContext)) {
var context = DurableContextImpl.createRootContext(executionManager, config, lambdaContext);
DurableContextImpl.setCurrentContext(context);
// use a try-with-resources to clear logger properties
try (var ignored = DurableLogger.attachContext()) {
return handler.apply(userInput, context);
}
},
Expand Down
Original file line number Diff line number Diff line change
@@ -20,10 +20,12 @@
import org.slf4j.LoggerFactory;
import software.amazon.awssdk.services.lambda.model.Operation;
import software.amazon.awssdk.services.lambda.model.OperationStatus;
import software.amazon.awssdk.services.lambda.model.OperationType;
import software.amazon.awssdk.services.lambda.model.OperationUpdate;
import software.amazon.lambda.durable.DurableConfig;
import software.amazon.lambda.durable.exception.UnrecoverableDurableExecutionException;
import software.amazon.lambda.durable.model.DurableExecutionInput;
import software.amazon.lambda.durable.model.SafeCloseable;
import software.amazon.lambda.durable.operation.BaseDurableOperation;

/**
@@ -47,7 +49,7 @@
*
* @see InternalExecutor
*/
public class ExecutionManager implements AutoCloseable {
public class ExecutionManager implements SafeCloseable {

private static final Logger logger = LoggerFactory.getLogger(ExecutionManager.class);

@@ -192,7 +194,8 @@ public Operation getExecutionOperation() {
* @return true if at least one operation exists with the given parentId
*/
public boolean hasOperationsForContext(String parentId) {
return operationStorage.values().stream().anyMatch(op -> Objects.equals(op.parentId(), parentId));
return operationStorage.values().stream()
.anyMatch(op -> op.type() != OperationType.EXECUTION && Objects.equals(op.parentId(), parentId));
}

// ===== Thread Coordination =====